author Zuma copybara merger <zuma-automerger@google.com> 2023-03-21 05:02:50 +0000
committer Copybara-Service <copybara-worker@google.com> 2023-03-23 22:03:43 -0700
commit 0a36ddb6b5b8c8b0de49b604b70eddd2d8b7ad54
tree 387863340c4f5e8e19fcd68494dabd6c2f3594c3
parent 7bb5660317e125dee80c6adbb27d1a3b136c15f4
[Copybara Auto Merge] Merge branch zuma into android14-gs-pixel-5.15
edgetpu: Continue powering up if the block is still on
Bug: 272701322
edgetpu: Add ABI documentation
edgetpu: usage_stats add cluster reconfigurations counters
Bug: 271372136
Bug: 271374892
edgetpu: usage_stats: process metrics v2 data
Bug: 271372136 (repeat)
Bug: 271374892 (repeat)

Signed-off-by: Zuma copybara merger <zuma-automerger@google.com>
GitOrigin-RevId: 599a31d4efcc191a247d4918b802758ec12e97ef
Change-Id: Ice5e0584766693ecf7a760cca9336d51b556f5ea
-rw-r--r-- Documentation/ABI/stable/sysfs-class-edgetpu | 205
-rw-r--r-- Documentation/ABI/stable/thermal-cdev | 13
-rw-r--r-- drivers/edgetpu/edgetpu-usage-stats.c | 335
-rw-r--r-- drivers/edgetpu/edgetpu-usage-stats.h | 24
-rw-r--r-- drivers/edgetpu/mobile-pm.c | 2
-rw-r--r-- drivers/edgetpu/rio/config.h | 3
6 files changed, 415 insertions, 167 deletions
diff --git a/Documentation/ABI/stable/sysfs-class-edgetpu b/Documentation/ABI/stable/sysfs-class-edgetpu
new file mode 100644
index 0000000..ad63661
--- /dev/null
+++ b/Documentation/ABI/stable/sysfs-class-edgetpu
@@ -0,0 +1,205 @@
+# SPDX-License-Identifier: GPL-2.0-only
+
+# Firmware Management
+
+What: /sys/class/edgetpu/edgetpu-soc/device/load_firmware
+Date: January 2020
+Description:
+ To load a firmware image, write its path, relative to the OS firmware location, to
+ this attribute. For example:
+ # echo google/my-test.fw > /sys/class/edgetpu/edgetpu-soc/device/load_firmware
+ Read this file to see the name of the currently loaded firmware image.
+Users: Edge TPU runtime library (libedgetpu)
+
+What: /sys/class/edgetpu/edgetpu-soc/device/firmware_type
+Date: August 2020
+Description:
+ “prod” or “test” firmware type/flavor. (Or “unknown” or “custom” or “stage 2
+ bootloader”.)
+Users: Edge TPU runtime library (libedgetpu)
+
+What: /sys/class/edgetpu/edgetpu-soc/device/firmware_version
+Date: November 2020
+Description:
+ Firmware major.minor version from the header, plus VII and KCI version numbers and
+ google3 build CL:
+ # cat /sys/class/edgetpu/hermosa.0.0/device/firmware_version
+ 1.0 vii=2 kci=1 cl=371245025
+Users: Edge TPU runtime library (libedgetpu)
+
+# General Status
+
+What: /sys/class/edgetpu/edgetpu-soc/device/clients
+Date: July 2021
+Description:
+ List clients that have opened the device by process and thread IDs. Also shows
+ current wakelock counts for debugging which client is holding the device powered.
+Users: Edge TPU runtime library (libedgetpu)
+
+What: /sys/class/edgetpu/edgetpu-soc/device/groups
+Date: August 2021
+Description:
+ List currently formed, forming, and disbanding device groups, with client PIDs and
+ amount of host and dma-buf memory mapped to the TPU, plus errors and VCIDs.
+Users: Edge TPU runtime library (libedgetpu)
+
+# Error Statistics
+# These statistics are maintained by the kernel driver.
+
+What: /sys/class/edgetpu/edgetpu-soc/device/firmware_crash_count
+Date: April 2021
+Description:
+ Count of “unrecoverable” firmware crash events; does not include “non-fatal” crashes
+ in non-privileged VII job processing code from which the firmware indicates it can
+ recover (that is, it only counts crashes in privileged firmware processing). (No
+ clear action.)
+Users: Edge TPU runtime library (libedgetpu)
+
+What: /sys/class/edgetpu/edgetpu-soc/device/watchdog_timeout_count
+Date: April 2021
+Description:
+ Count of watchdog timeout events, including both host software watchdog timeouts
+ (that is, firmware fails to respond to periodic query by the host kernel) and
+ device-side watchdog timeout events sent from firmware (on Hermosa). (No clear
+ action.)
+Users: Edge TPU runtime library (libedgetpu)
+
+# Performance/Usage Statistics
+# These stats are gathered from the firmware periodically while the device is powered up, and also
+# at mobile power down time (or Hermosa device group disband time). Reading the sysfs file will
+# immediately poll for updated values if the TPU device is currently powered on; if the (mobile)
+# device is powered down then the last received value is returned. Some of these attributes are
+# only provided for certain chipsets as noted below.
+
+What: /sys/class/edgetpu/edgetpu-soc/device/tpu_usage
+Date: January 2021
+Description:
+ TPU usage duration in microseconds per “UID” (an Android app context ID for
+ Android/Pixel; on Hermosa the UID is always zero). Write to clear. Used for
+ Android battery consumption blaming.
+Users: Edge TPU runtime library (libedgetpu)
+
+What: /sys/class/edgetpu/edgetpu-soc/device/tpu_utilization
+Date: February 2021
+Description:
+ TPU (GCB only) utilization as a percentage of time. (No clear action.)
+Users: Edge TPU runtime library (libedgetpu)
+
+What: /sys/class/edgetpu/edgetpu-soc/device/device_utilization
+Date: February 2021
+Description:
+ Whole TPU device utilization as a percentage of time. (No clear action.)
+Users: Edge TPU runtime library (libedgetpu)
+
+What: /sys/class/edgetpu/edgetpu-soc/device/tpu_active_cycle_count
+Date: March 2021
+Description:
+ Number of active TPU cycles since last reset (Mobile power down or Hermosa device
+ group disband). Write to clear.
+Users: Edge TPU runtime library (libedgetpu)
+
+What: /sys/class/edgetpu/edgetpu-soc/device/tpu_throttle_stall_count
+Date: March 2021
+Description:
+ Number of hardware throttling stall cycles inserted since last reset (Mobile power
+ down or Hermosa device group disband). Write to clear. (Always zero on Abrolhos.)
+Users: Edge TPU runtime library (libedgetpu)
+
+What: /sys/class/edgetpu/edgetpu-soc/device/inference_count
+Date: April 2021
+Description:
+ Number of graph invocations. Write to clear.
+Users: Edge TPU runtime library (libedgetpu)
+
+What: /sys/class/edgetpu/edgetpu-soc/device/tpu_op_count
+Date: April 2021
+Description:
+ Number of TPU offload op invocations. Write to clear.
+Users: Edge TPU runtime library (libedgetpu)
+
+What: /sys/class/edgetpu/edgetpu-soc/device/param_cache_hit_count
+Date: April 2021
+Description:
+ Number of times a TPU op invocation used its cached parameters. Write to clear.
+Users: Edge TPU runtime library (libedgetpu)
+
+What: /sys/class/edgetpu/edgetpu-soc/device/param_cache_miss_count
+Date: April 2021
+Description:
+ Number of times a TPU op invocation had to cache its parameters. Write to clear.
+Users: Edge TPU runtime library (libedgetpu)
+
+What: /sys/class/edgetpu/edgetpu-soc/device/context_preempt_count
+Date: April 2021
+Description:
+ Number of times an application/client context was preempted by another context.
+ Write to clear.
+Users: Edge TPU runtime library (libedgetpu)
+
+What: /sys/class/edgetpu/edgetpu-soc/device/outstanding_commands_max
+Date: April 2021
+Description:
+ Maximum number of outstanding commands. Write to clear.
+Users: Edge TPU runtime library (libedgetpu)
+
+What: /sys/class/edgetpu/edgetpu-soc/device/preempt_depth_max
+Date: April 2021
+Description:
+ Maximum number of preempted application/client contexts at any time. Write to
+ clear.
+Users: Edge TPU runtime library (libedgetpu)
+
+What: /sys/class/edgetpu/edgetpu-soc/device/fw_thread_stats
+Date: April 2021
+Description:
+ Maximum stack depth per thread id. Write to clear.
+Users: Edge TPU runtime library (libedgetpu)
+
+# The following are not present or not meaningful (always zero) on Abrolhos or Hermosa; are present
+# on mobile Janeiro and beyond.
+
+What: /sys/class/edgetpu/edgetpu-soc/device/hardware_preempt_count
+Date: November 2021
+Description:
+ Number of times a hardware preemption occurred. Write to clear.
+Users: Edge TPU runtime library (libedgetpu)
+
+What: /sys/class/edgetpu/edgetpu-soc/device/hardware_ctx_save_time
+Date: April 2022
+Description:
+ Hardware context save time in usecs. Write to clear.
+Users: Edge TPU runtime library (libedgetpu)
+
+What: /sys/class/edgetpu/edgetpu-soc/device/hardware_ctx_save_time_max
+Date: April 2022
+Description:
+ Maximum time spent saving a hardware context, in usecs. Write to clear.
+Users: Edge TPU runtime library (libedgetpu)
+
+What: /sys/class/edgetpu/edgetpu-soc/device/scalar_fence_wait_time
+Date: April 2022
+Description:
+ Total time spent waiting to hit a scalar fence during hardware preemption, in usecs.
+ Write to clear.
+Users: Edge TPU runtime library (libedgetpu)
+
+What: /sys/class/edgetpu/edgetpu-soc/device/scalar_fence_wait_time_max
+Date: April 2022
+Description:
+ Maximum time spent waiting to hit a scalar fence during hardware preemption, in
+ usecs. Write to clear.
+Users: Edge TPU runtime library (libedgetpu)
+
+What: /sys/class/edgetpu/edgetpu-soc/device/long_suspend_count
+Date: April 2022
+Description:
+ Count of “long” suspends (“number of times Pipeline::Suspend takes longer than
+ 5ms”). Write to clear.
+Users: Edge TPU runtime library (libedgetpu)
+
+What: /sys/class/edgetpu/edgetpu-soc/device/suspend_time_max
+Date: April 2022
+Description:
+ Maximum suspend time (“high water mark for time spent in Pipeline::Suspend”). Write
+ to clear.
+Users: Edge TPU runtime library (libedgetpu)
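A note for readers of these attributes: the driver change in this same commit formats some counters (for example inference_count and tpu_op_count) as one value per TPU cluster, separated by single spaces and terminated by a newline, when the firmware reports metrics v2. The following is a minimal userspace sketch of how such a line might be parsed. The helper name parse_cluster_values and its error convention are assumptions made for illustration; they are not part of the driver or of libedgetpu.

```c
#include <assert.h>
#include <ctype.h>
#include <errno.h>
#include <stddef.h>
#include <stdlib.h>

/*
 * Hypothetical helper: parse a line such as "12 0 7\n" as read from a
 * usage-stats attribute. Stores up to max values in out.
 * Returns the number of values parsed, or -1 if the line holds anything
 * other than unsigned decimal numbers, overflows, or has more than max values.
 */
int parse_cluster_values(const char *line, unsigned long long *out, size_t max)
{
	const char *p = line;
	size_t n = 0;

	assert(line != NULL && out != NULL);
	while (n < max) {
		char *end;

		while (*p == ' ')
			p++;
		/* Reject signs and other prefixes that strtoull() would accept. */
		if (!isdigit((unsigned char)*p))
			break;
		errno = 0;
		out[n] = strtoull(p, &end, 10);
		if (errno == ERANGE)
			return -1;
		n++;
		p = end;
	}
	/* Only separators and the trailing newline may remain. */
	while (*p == ' ' || *p == '\n')
		p++;
	return *p == '\0' ? (int)n : -1;
}
```

A caller would read the attribute into a buffer, pass it to this helper with an array sized for the largest cluster count it expects, and treat a return value of 1 as the single-value (metrics v1 or single-cluster) case.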
diff --git a/Documentation/ABI/stable/thermal-cdev b/Documentation/ABI/stable/thermal-cdev
new file mode 100644
index 0000000..3c457e2
--- /dev/null
+++ b/Documentation/ABI/stable/thermal-cdev
@@ -0,0 +1,13 @@
+# SPDX-License-Identifier: GPL-2.0-only
+
+What: /dev/thermal/cdev-by-name/tpu_cooling/user_vote
+Date: October 2021
+Description:
+ To set a thermal state vote.
+Users: Tj Thermal
+
+What: /dev/thermal/cdev-by-name/tpu_cooling/state2power_table
+Date: March 2023
+Description:
+ Thermal state to power consumption table for thermal throttling.
+Users: Tj Thermal
diff --git a/drivers/edgetpu/edgetpu-usage-stats.c b/drivers/edgetpu/edgetpu-usage-stats.c
index 60751dd..9934ca6 100644
--- a/drivers/edgetpu/edgetpu-usage-stats.c
+++ b/drivers/edgetpu/edgetpu-usage-stats.c
@@ -74,6 +74,7 @@ int edgetpu_usage_add(struct edgetpu_dev *etdev, struct tpu_usage *tpu_usage)
if (!ustats)
return 0;
+ /* Note: as of metrics v2 the cluster_id is always zero and is ignored. */
etdev_dbg(etdev, "%s: uid=%u state=%u dur=%u", __func__,
tpu_usage->uid, tpu_usage->power_state,
tpu_usage->duration_us);
@@ -125,63 +126,78 @@ static void edgetpu_utilization_update(
mutex_unlock(&ustats->usage_stats_lock);
}
-static void edgetpu_counter_update(
- struct edgetpu_dev *etdev,
- struct edgetpu_usage_counter *counter)
+static void edgetpu_counter_update(struct edgetpu_dev *etdev, struct edgetpu_usage_counter *counter,
+ uint version)
{
struct edgetpu_usage_stats *ustats = etdev->usage_stats;
+ uint component = version > 1 ? counter->component_id : 0;
if (!ustats)
return;
- etdev_dbg(etdev, "%s: type=%d value=%llu\n", __func__,
- counter->type, counter->value);
+ etdev_dbg(etdev, "%s: type=%d value=%llu comp=%u\n", __func__, counter->type,
+ counter->value, component);
mutex_lock(&ustats->usage_stats_lock);
if (counter->type >= 0 && counter->type < EDGETPU_COUNTER_COUNT)
- ustats->counter[counter->type] += counter->value;
+ ustats->counter[counter->type][component] += counter->value;
mutex_unlock(&ustats->usage_stats_lock);
}
-static void edgetpu_counter_clear(
- struct edgetpu_dev *etdev,
- enum edgetpu_usage_counter_type counter_type)
+static void edgetpu_counter_clear(struct edgetpu_dev *etdev,
+ enum edgetpu_usage_counter_type counter_type)
{
struct edgetpu_usage_stats *ustats = etdev->usage_stats;
+ int i;
- if (!ustats)
- return;
if (counter_type >= EDGETPU_COUNTER_COUNT)
return;
mutex_lock(&ustats->usage_stats_lock);
- ustats->counter[counter_type] = 0;
+ for (i = 0; i < EDGETPU_TPU_CLUSTER_COUNT; i++)
+ ustats->counter[counter_type][i] = 0;
mutex_unlock(&ustats->usage_stats_lock);
}
-static void edgetpu_max_watermark_update(
- struct edgetpu_dev *etdev,
- struct edgetpu_usage_max_watermark *max_watermark)
+static void edgetpu_max_watermark_update(struct edgetpu_dev *etdev,
+ struct edgetpu_usage_max_watermark *max_watermark,
+ uint version)
{
struct edgetpu_usage_stats *ustats = etdev->usage_stats;
+ uint component = version > 1 ? max_watermark->component_id : 0;
if (!ustats)
return;
- etdev_dbg(etdev, "%s: type=%d value=%llu\n", __func__,
- max_watermark->type, max_watermark->value);
+ etdev_dbg(etdev, "%s: type=%d value=%llu comp=%u\n", __func__, max_watermark->type,
+ max_watermark->value, component);
if (max_watermark->type < 0 ||
max_watermark->type >= EDGETPU_MAX_WATERMARK_TYPE_COUNT)
return;
mutex_lock(&ustats->usage_stats_lock);
- if (max_watermark->value > ustats->max_watermark[max_watermark->type])
- ustats->max_watermark[max_watermark->type] =
+ if (max_watermark->value > ustats->max_watermark[max_watermark->type][component])
+ ustats->max_watermark[max_watermark->type][component] =
max_watermark->value;
mutex_unlock(&ustats->usage_stats_lock);
}
+static void edgetpu_max_watermark_clear(struct edgetpu_dev *etdev,
+ enum edgetpu_usage_max_watermark_type max_watermark_type)
+{
+ struct edgetpu_usage_stats *ustats = etdev->usage_stats;
+ int i;
+
+ if (max_watermark_type < 0 || max_watermark_type >= EDGETPU_MAX_WATERMARK_TYPE_COUNT)
+ return;
+
+ mutex_lock(&ustats->usage_stats_lock);
+ for (i = 0; i < EDGETPU_TPU_CLUSTER_COUNT; i++)
+ ustats->max_watermark[max_watermark_type][i] = 0;
+ mutex_unlock(&ustats->usage_stats_lock);
+}
+
static void edgetpu_thread_stats_update(
struct edgetpu_dev *etdev,
struct edgetpu_thread_stats *thread_stats)
@@ -288,19 +304,16 @@ void edgetpu_usage_stats_process_buffer(struct edgetpu_dev *etdev, void *buf)
etdev, &metric->component_activity);
break;
case EDGETPU_METRIC_TYPE_COUNTER:
- edgetpu_counter_update(etdev, &metric->counter);
+ edgetpu_counter_update(etdev, &metric->counter, version);
break;
case EDGETPU_METRIC_TYPE_MAX_WATERMARK:
- edgetpu_max_watermark_update(
- etdev, &metric->max_watermark);
+ edgetpu_max_watermark_update(etdev, &metric->max_watermark, version);
break;
case EDGETPU_METRIC_TYPE_THREAD_STATS:
- edgetpu_thread_stats_update(
- etdev, &metric->thread_stats);
+ edgetpu_thread_stats_update(etdev, &metric->thread_stats);
break;
case EDGETPU_METRIC_TYPE_DVFS_FREQUENCY_INFO:
- edgetpu_dvfs_frequency_update(
- etdev, metric->dvfs_frequency_info);
+ edgetpu_dvfs_frequency_update(etdev, metric->dvfs_frequency_info);
break;
default:
etdev_dbg(etdev, "%s: %d: skip unknown type=%u",
@@ -328,36 +341,72 @@ int edgetpu_usage_get_utilization(struct edgetpu_dev *etdev,
return val;
}
-static int64_t edgetpu_usage_get_counter(
- struct edgetpu_dev *etdev,
- enum edgetpu_usage_counter_type counter_type)
+/*
+ * Resyncs firmware stats and formats the requested counter in the supplied buffer.
+ *
+ * If @report_per_cluster is true, and if the firmware implements metrics V2 or higher,
+ * then one value is formatted per cluster (for chips with only one cluster only one value is
+ * formatted).
+ *
+ * Returns the number of bytes written to buf.
+ */
+static ssize_t edgetpu_usage_format_counter(struct edgetpu_dev *etdev, char *buf,
+ enum edgetpu_usage_counter_type counter_type,
+ bool report_per_cluster)
{
struct edgetpu_usage_stats *ustats = etdev->usage_stats;
- int64_t val;
+ uint ncomponents = report_per_cluster && !etdev->usage_stats->use_metrics_v1 ?
+ EDGETPU_TPU_CLUSTER_COUNT : 1;
+ uint i;
+ ssize_t ret = 0;
if (counter_type >= EDGETPU_COUNTER_COUNT)
- return -1;
+ return 0;
edgetpu_kci_update_usage(etdev);
mutex_lock(&ustats->usage_stats_lock);
- val = ustats->counter[counter_type];
+ for (i = 0; i < ncomponents; i++) {
+ if (i)
+ ret += scnprintf(buf + ret, PAGE_SIZE - ret, " ");
+ ret += scnprintf(buf + ret, PAGE_SIZE - ret, "%llu",
+ ustats->counter[counter_type][i]);
+ }
mutex_unlock(&ustats->usage_stats_lock);
- return val;
+ ret += scnprintf(buf + ret, PAGE_SIZE - ret, "\n");
+ return ret;
}
-static int64_t edgetpu_usage_get_max_watermark(
- struct edgetpu_dev *etdev,
- enum edgetpu_usage_max_watermark_type max_watermark_type)
+/*
+ * Resyncs firmware stats and formats the requested max watermark in the supplied buffer.
+ *
+ * If @report_per_cluster is true, and if the firmware implements metrics V2 or higher,
+ * then one value is formatted per cluster (for chips with only one cluster only one value is
+ * formatted).
+ *
+ * Returns the number of bytes written to buf.
+ */
+static ssize_t edgetpu_usage_format_max_watermark(
+ struct edgetpu_dev *etdev, char *buf,
+ enum edgetpu_usage_max_watermark_type max_watermark_type, bool report_per_cluster)
{
struct edgetpu_usage_stats *ustats = etdev->usage_stats;
- int64_t val;
+ uint ncomponents = report_per_cluster && !etdev->usage_stats->use_metrics_v1 ?
+ EDGETPU_TPU_CLUSTER_COUNT : 1;
+ uint i;
+ ssize_t ret = 0;
if (max_watermark_type >= EDGETPU_MAX_WATERMARK_TYPE_COUNT)
- return -1;
+ return 0;
edgetpu_kci_update_usage(etdev);
mutex_lock(&ustats->usage_stats_lock);
- val = ustats->max_watermark[max_watermark_type];
+ for (i = 0; i < ncomponents; i++) {
+ if (i)
+ ret += scnprintf(buf + ret, PAGE_SIZE - ret, " ");
+ ret += scnprintf(buf + ret, PAGE_SIZE - ret, "%llu",
+ ustats->max_watermark[max_watermark_type][i]);
+ }
mutex_unlock(&ustats->usage_stats_lock);
- return val;
+ ret += scnprintf(buf + ret, PAGE_SIZE - ret, "\n");
+ return ret;
}
static ssize_t tpu_usage_show(struct device *dev,
@@ -471,11 +520,8 @@ static ssize_t tpu_active_cycle_count_show(struct device *dev,
char *buf)
{
struct edgetpu_dev *etdev = dev_get_drvdata(dev);
- int64_t val;
- val = edgetpu_usage_get_counter(etdev,
- EDGETPU_COUNTER_TPU_ACTIVE_CYCLES);
- return scnprintf(buf, PAGE_SIZE, "%llu\n", val);
+ return edgetpu_usage_format_counter(etdev, buf, EDGETPU_COUNTER_TPU_ACTIVE_CYCLES, false);
}
static ssize_t tpu_active_cycle_count_store(struct device *dev,
@@ -496,11 +542,8 @@ static ssize_t tpu_throttle_stall_count_show(struct device *dev,
char *buf)
{
struct edgetpu_dev *etdev = dev_get_drvdata(dev);
- int64_t val;
- val = edgetpu_usage_get_counter(etdev,
- EDGETPU_COUNTER_TPU_THROTTLE_STALLS);
- return scnprintf(buf, PAGE_SIZE, "%llu\n", val);
+ return edgetpu_usage_format_counter(etdev, buf, EDGETPU_COUNTER_TPU_THROTTLE_STALLS, false);
}
static ssize_t tpu_throttle_stall_count_store(struct device *dev,
@@ -521,11 +564,8 @@ static ssize_t inference_count_show(struct device *dev,
struct device_attribute *attr, char *buf)
{
struct edgetpu_dev *etdev = dev_get_drvdata(dev);
- int64_t val;
- val = edgetpu_usage_get_counter(etdev,
- EDGETPU_COUNTER_INFERENCES);
- return scnprintf(buf, PAGE_SIZE, "%llu\n", val);
+ return edgetpu_usage_format_counter(etdev, buf, EDGETPU_COUNTER_INFERENCES, true);
}
static ssize_t inference_count_store(struct device *dev,
@@ -541,21 +581,15 @@ static ssize_t inference_count_store(struct device *dev,
static DEVICE_ATTR(inference_count, 0664, inference_count_show,
inference_count_store);
-static ssize_t tpu_op_count_show(struct device *dev,
- struct device_attribute *attr, char *buf)
+static ssize_t tpu_op_count_show(struct device *dev, struct device_attribute *attr, char *buf)
{
struct edgetpu_dev *etdev = dev_get_drvdata(dev);
- int64_t val;
- val = edgetpu_usage_get_counter(etdev,
- EDGETPU_COUNTER_TPU_OPS);
- return scnprintf(buf, PAGE_SIZE, "%llu\n", val);
+ return edgetpu_usage_format_counter(etdev, buf, EDGETPU_COUNTER_TPU_OPS, true);
}
-static ssize_t tpu_op_count_store(struct device *dev,
- struct device_attribute *attr,
- const char *buf,
- size_t count)
+static ssize_t tpu_op_count_store(struct device *dev, struct device_attribute *attr,
+ const char *buf, size_t count)
{
struct edgetpu_dev *etdev = dev_get_drvdata(dev);
@@ -564,22 +598,16 @@ static ssize_t tpu_op_count_store(struct device *dev,
}
static DEVICE_ATTR(tpu_op_count, 0664, tpu_op_count_show, tpu_op_count_store);
-static ssize_t param_cache_hit_count_show(struct device *dev,
- struct device_attribute *attr,
+static ssize_t param_cache_hit_count_show(struct device *dev, struct device_attribute *attr,
char *buf)
{
struct edgetpu_dev *etdev = dev_get_drvdata(dev);
- int64_t val;
- val = edgetpu_usage_get_counter(etdev,
- EDGETPU_COUNTER_PARAM_CACHE_HITS);
- return scnprintf(buf, PAGE_SIZE, "%llu\n", val);
+ return edgetpu_usage_format_counter(etdev, buf, EDGETPU_COUNTER_PARAM_CACHE_HITS, false);
}
-static ssize_t param_cache_hit_count_store(struct device *dev,
- struct device_attribute *attr,
- const char *buf,
- size_t count)
+static ssize_t param_cache_hit_count_store(struct device *dev, struct device_attribute *attr,
+ const char *buf, size_t count)
{
struct edgetpu_dev *etdev = dev_get_drvdata(dev);
@@ -589,22 +617,16 @@ static ssize_t param_cache_hit_count_store(struct device *dev,
static DEVICE_ATTR(param_cache_hit_count, 0664, param_cache_hit_count_show,
param_cache_hit_count_store);
-static ssize_t param_cache_miss_count_show(struct device *dev,
- struct device_attribute *attr,
+static ssize_t param_cache_miss_count_show(struct device *dev, struct device_attribute *attr,
char *buf)
{
struct edgetpu_dev *etdev = dev_get_drvdata(dev);
- int64_t val;
- val = edgetpu_usage_get_counter(etdev,
- EDGETPU_COUNTER_PARAM_CACHE_MISSES);
- return scnprintf(buf, PAGE_SIZE, "%llu\n", val);
+ return edgetpu_usage_format_counter(etdev, buf, EDGETPU_COUNTER_PARAM_CACHE_MISSES, false);
}
-static ssize_t param_cache_miss_count_store(struct device *dev,
- struct device_attribute *attr,
- const char *buf,
- size_t count)
+static ssize_t param_cache_miss_count_store(struct device *dev, struct device_attribute *attr,
+ const char *buf, size_t count)
{
struct edgetpu_dev *etdev = dev_get_drvdata(dev);
@@ -614,22 +636,16 @@ static ssize_t param_cache_miss_count_store(struct device *dev,
static DEVICE_ATTR(param_cache_miss_count, 0664, param_cache_miss_count_show,
param_cache_miss_count_store);
-static ssize_t context_preempt_count_show(struct device *dev,
- struct device_attribute *attr,
+static ssize_t context_preempt_count_show(struct device *dev, struct device_attribute *attr,
char *buf)
{
struct edgetpu_dev *etdev = dev_get_drvdata(dev);
- int64_t val;
- val = edgetpu_usage_get_counter(etdev,
- EDGETPU_COUNTER_CONTEXT_PREEMPTS);
- return scnprintf(buf, PAGE_SIZE, "%llu\n", val);
+ return edgetpu_usage_format_counter(etdev, buf, EDGETPU_COUNTER_CONTEXT_PREEMPTS, true);
}
-static ssize_t context_preempt_count_store(struct device *dev,
- struct device_attribute *attr,
- const char *buf,
- size_t count)
+static ssize_t context_preempt_count_store(struct device *dev, struct device_attribute *attr,
+ const char *buf, size_t count)
{
struct edgetpu_dev *etdev = dev_get_drvdata(dev);
@@ -643,10 +659,8 @@ static ssize_t hardware_preempt_count_show(struct device *dev, struct device_att
char *buf)
{
struct edgetpu_dev *etdev = dev_get_drvdata(dev);
- int64_t val;
- val = edgetpu_usage_get_counter(etdev, EDGETPU_COUNTER_HARDWARE_PREEMPTS);
- return scnprintf(buf, PAGE_SIZE, "%llu\n", val);
+ return edgetpu_usage_format_counter(etdev, buf, EDGETPU_COUNTER_HARDWARE_PREEMPTS, true);
}
static ssize_t hardware_preempt_count_store(struct device *dev, struct device_attribute *attr,
@@ -664,10 +678,9 @@ static ssize_t hardware_ctx_save_time_show(struct device *dev, struct device_att
char *buf)
{
struct edgetpu_dev *etdev = dev_get_drvdata(dev);
- int64_t val;
- val = edgetpu_usage_get_counter(etdev, EDGETPU_COUNTER_HARDWARE_CTX_SAVE_TIME_US);
- return scnprintf(buf, PAGE_SIZE, "%llu\n", val);
+ return edgetpu_usage_format_counter(etdev, buf, EDGETPU_COUNTER_HARDWARE_CTX_SAVE_TIME_US,
+ true);
}
static ssize_t hardware_ctx_save_time_store(struct device *dev, struct device_attribute *attr,
@@ -685,10 +698,9 @@ static ssize_t scalar_fence_wait_time_show(struct device *dev, struct device_att
char *buf)
{
struct edgetpu_dev *etdev = dev_get_drvdata(dev);
- int64_t val;
- val = edgetpu_usage_get_counter(etdev, EDGETPU_COUNTER_SCALAR_FENCE_WAIT_TIME_US);
- return scnprintf(buf, PAGE_SIZE, "%llu\n", val);
+ return edgetpu_usage_format_counter(etdev, buf, EDGETPU_COUNTER_SCALAR_FENCE_WAIT_TIME_US,
+ true);
}
static ssize_t scalar_fence_wait_time_store(struct device *dev, struct device_attribute *attr,
@@ -703,13 +715,11 @@ static DEVICE_ATTR(scalar_fence_wait_time, 0664, scalar_fence_wait_time_show,
scalar_fence_wait_time_store);
static ssize_t long_suspend_count_show(struct device *dev, struct device_attribute *attr,
- char *buf)
+ char *buf)
{
struct edgetpu_dev *etdev = dev_get_drvdata(dev);
- int64_t val;
- val = edgetpu_usage_get_counter(etdev, EDGETPU_COUNTER_LONG_SUSPEND);
- return scnprintf(buf, PAGE_SIZE, "%llu\n", val);
+ return edgetpu_usage_format_counter(etdev, buf, EDGETPU_COUNTER_LONG_SUSPEND, false);
}
static ssize_t long_suspend_count_store(struct device *dev, struct device_attribute *attr,
@@ -723,15 +733,53 @@ static ssize_t long_suspend_count_store(struct device *dev, struct device_attrib
static DEVICE_ATTR(long_suspend_count, 0664, long_suspend_count_show,
long_suspend_count_store);
+#if EDGETPU_TPU_CLUSTER_COUNT > 1
+static ssize_t reconfigurations_show(struct device *dev, struct device_attribute *attr, char *buf)
+{
+ struct edgetpu_dev *etdev = dev_get_drvdata(dev);
+
+ return edgetpu_usage_format_counter(etdev, buf, EDGETPU_COUNTER_RECONFIGURATIONS, false);
+}
+
+static ssize_t reconfigurations_store(struct device *dev, struct device_attribute *attr,
+ const char *buf, size_t count)
+{
+ struct edgetpu_dev *etdev = dev_get_drvdata(dev);
+
+ edgetpu_counter_clear(etdev, EDGETPU_COUNTER_RECONFIGURATIONS);
+ return count;
+}
+static DEVICE_ATTR(reconfigurations, 0664, reconfigurations_show, reconfigurations_store);
+
+static ssize_t preempt_reconfigurations_show(struct device *dev, struct device_attribute *attr,
+ char *buf)
+{
+ struct edgetpu_dev *etdev = dev_get_drvdata(dev);
+
+ return edgetpu_usage_format_counter(etdev, buf, EDGETPU_COUNTER_PREEMPT_RECONFIGURATIONS,
+ false);
+}
+
+static ssize_t preempt_reconfigurations_store(struct device *dev, struct device_attribute *attr,
+ const char *buf, size_t count)
+{
+ struct edgetpu_dev *etdev = dev_get_drvdata(dev);
+
+ edgetpu_counter_clear(etdev, EDGETPU_COUNTER_PREEMPT_RECONFIGURATIONS);
+ return count;
+}
+static DEVICE_ATTR(preempt_reconfigurations, 0664, preempt_reconfigurations_show,
+ preempt_reconfigurations_store);
+#endif /* EDGETPU_TPU_CLUSTER_COUNT > 1 */
+
+
static ssize_t outstanding_commands_max_show(
struct device *dev, struct device_attribute *attr, char *buf)
{
struct edgetpu_dev *etdev = dev_get_drvdata(dev);
- int64_t val;
- val = edgetpu_usage_get_max_watermark(
- etdev, EDGETPU_MAX_WATERMARK_OUT_CMDS);
- return scnprintf(buf, PAGE_SIZE, "%llu\n", val);
+ return edgetpu_usage_format_max_watermark(etdev, buf, EDGETPU_MAX_WATERMARK_OUT_CMDS,
+ false);
}
static ssize_t outstanding_commands_max_store(
@@ -739,14 +787,8 @@ static ssize_t outstanding_commands_max_store(
const char *buf, size_t count)
{
struct edgetpu_dev *etdev = dev_get_drvdata(dev);
- struct edgetpu_usage_stats *ustats = etdev->usage_stats;
-
- if (ustats) {
- mutex_lock(&ustats->usage_stats_lock);
- ustats->max_watermark[EDGETPU_MAX_WATERMARK_OUT_CMDS] = 0;
- mutex_unlock(&ustats->usage_stats_lock);
- }
+ edgetpu_max_watermark_clear(etdev, EDGETPU_MAX_WATERMARK_OUT_CMDS);
return count;
}
static DEVICE_ATTR(outstanding_commands_max, 0664,
@@ -757,11 +799,9 @@ static ssize_t preempt_depth_max_show(
struct device *dev, struct device_attribute *attr, char *buf)
{
struct edgetpu_dev *etdev = dev_get_drvdata(dev);
- int64_t val;
- val = edgetpu_usage_get_max_watermark(
- etdev, EDGETPU_MAX_WATERMARK_PREEMPT_DEPTH);
- return scnprintf(buf, PAGE_SIZE, "%llu\n", val);
+ return edgetpu_usage_format_max_watermark(etdev, buf, EDGETPU_MAX_WATERMARK_PREEMPT_DEPTH,
+ true);
}
static ssize_t preempt_depth_max_store(
@@ -769,14 +809,8 @@ static ssize_t preempt_depth_max_store(
const char *buf, size_t count)
{
struct edgetpu_dev *etdev = dev_get_drvdata(dev);
- struct edgetpu_usage_stats *ustats = etdev->usage_stats;
-
- if (ustats) {
- mutex_lock(&ustats->usage_stats_lock);
- ustats->max_watermark[EDGETPU_MAX_WATERMARK_PREEMPT_DEPTH] = 0;
- mutex_unlock(&ustats->usage_stats_lock);
- }
+ edgetpu_max_watermark_clear(etdev, EDGETPU_MAX_WATERMARK_PREEMPT_DEPTH);
return count;
}
static DEVICE_ATTR(preempt_depth_max, 0664, preempt_depth_max_show,
@@ -786,11 +820,10 @@ static ssize_t hardware_ctx_save_time_max_show(
struct device *dev, struct device_attribute *attr, char *buf)
{
struct edgetpu_dev *etdev = dev_get_drvdata(dev);
- int64_t val;
- val = edgetpu_usage_get_max_watermark(
- etdev, EDGETPU_MAX_WATERMARK_HARDWARE_CTX_SAVE_TIME_US);
- return scnprintf(buf, PAGE_SIZE, "%llu\n", val);
+ return edgetpu_usage_format_max_watermark(etdev, buf,
+ EDGETPU_MAX_WATERMARK_HARDWARE_CTX_SAVE_TIME_US,
+ true);
}
static ssize_t hardware_ctx_save_time_max_store(
@@ -798,14 +831,8 @@ static ssize_t hardware_ctx_save_time_max_store(
const char *buf, size_t count)
{
struct edgetpu_dev *etdev = dev_get_drvdata(dev);
- struct edgetpu_usage_stats *ustats = etdev->usage_stats;
-
- if (ustats) {
- mutex_lock(&ustats->usage_stats_lock);
- ustats->max_watermark[EDGETPU_MAX_WATERMARK_HARDWARE_CTX_SAVE_TIME_US] = 0;
- mutex_unlock(&ustats->usage_stats_lock);
- }
+ edgetpu_max_watermark_clear(etdev, EDGETPU_MAX_WATERMARK_HARDWARE_CTX_SAVE_TIME_US);
return count;
}
static DEVICE_ATTR(hardware_ctx_save_time_max, 0664, hardware_ctx_save_time_max_show,
@@ -815,11 +842,9 @@ static ssize_t scalar_fence_wait_time_max_show(
struct device *dev, struct device_attribute *attr, char *buf)
{
struct edgetpu_dev *etdev = dev_get_drvdata(dev);
- int64_t val;
- val = edgetpu_usage_get_max_watermark(
- etdev, EDGETPU_MAX_WATERMARK_SCALAR_FENCE_WAIT_TIME_US);
- return scnprintf(buf, PAGE_SIZE, "%llu\n", val);
+ return edgetpu_usage_format_max_watermark(
+ etdev, buf, EDGETPU_MAX_WATERMARK_SCALAR_FENCE_WAIT_TIME_US, true);
}
static ssize_t scalar_fence_wait_time_max_store(
@@ -827,14 +852,8 @@ static ssize_t scalar_fence_wait_time_max_store(
const char *buf, size_t count)
{
struct edgetpu_dev *etdev = dev_get_drvdata(dev);
- struct edgetpu_usage_stats *ustats = etdev->usage_stats;
-
- if (ustats) {
- mutex_lock(&ustats->usage_stats_lock);
- ustats->max_watermark[EDGETPU_MAX_WATERMARK_SCALAR_FENCE_WAIT_TIME_US] = 0;
- mutex_unlock(&ustats->usage_stats_lock);
- }
+ edgetpu_max_watermark_clear(etdev, EDGETPU_MAX_WATERMARK_SCALAR_FENCE_WAIT_TIME_US);
return count;
}
static DEVICE_ATTR(scalar_fence_wait_time_max, 0664, scalar_fence_wait_time_max_show,
@@ -844,11 +863,9 @@ static ssize_t suspend_time_max_show(
struct device *dev, struct device_attribute *attr, char *buf)
{
struct edgetpu_dev *etdev = dev_get_drvdata(dev);
- int64_t val;
- val = edgetpu_usage_get_max_watermark(
- etdev, EDGETPU_MAX_WATERMARK_SUSPEND_TIME_US);
- return scnprintf(buf, PAGE_SIZE, "%llu\n", val);
+ return edgetpu_usage_format_max_watermark(etdev, buf, EDGETPU_MAX_WATERMARK_SUSPEND_TIME_US,
+ false);
}
static ssize_t suspend_time_max_store(
@@ -856,14 +873,8 @@ static ssize_t suspend_time_max_store(
const char *buf, size_t count)
{
struct edgetpu_dev *etdev = dev_get_drvdata(dev);
- struct edgetpu_usage_stats *ustats = etdev->usage_stats;
-
- if (ustats) {
- mutex_lock(&ustats->usage_stats_lock);
- ustats->max_watermark[EDGETPU_MAX_WATERMARK_SUSPEND_TIME_US] = 0;
- mutex_unlock(&ustats->usage_stats_lock);
- }
+ edgetpu_max_watermark_clear(etdev, EDGETPU_MAX_WATERMARK_SUSPEND_TIME_US);
return count;
}
static DEVICE_ATTR(suspend_time_max, 0664, suspend_time_max_show,
@@ -924,6 +935,10 @@ static struct attribute *usage_stats_dev_attrs[] = {
&dev_attr_hardware_ctx_save_time.attr,
&dev_attr_scalar_fence_wait_time.attr,
&dev_attr_long_suspend_count.attr,
+#if EDGETPU_TPU_CLUSTER_COUNT > 1
+ &dev_attr_reconfigurations.attr,
+ &dev_attr_preempt_reconfigurations.attr,
+#endif
&dev_attr_outstanding_commands_max.attr,
&dev_attr_preempt_depth_max.attr,
&dev_attr_hardware_ctx_save_time_max.attr,
diff --git a/drivers/edgetpu/edgetpu-usage-stats.h b/drivers/edgetpu/edgetpu-usage-stats.h
index ee908e1..2d97043 100644
--- a/drivers/edgetpu/edgetpu-usage-stats.h
+++ b/drivers/edgetpu/edgetpu-usage-stats.h
@@ -13,6 +13,9 @@
/* The highest version of usage metrics handled by this driver. */
#define EDGETPU_USAGE_METRIC_VERSION 2
+/* Max # of TPU clusters accounted for in the highest supported metrics version. */
+#define EDGETPU_USAGE_CLUSTERS_MAX 3
+
/*
* Size in bytes of usage metric v1.
* If fewer bytes than this are received then discard the invalid buffer.
@@ -54,6 +57,7 @@ struct tpu_usage {
/* Compute Core: TPU cluster ID. */
/* Called core_id in FW. */
+ /* Note: as of metrics v2 the cluster_id is always zero and is ignored. */
uint8_t cluster_id;
/* Reserved. Filling out the next 32-bit boundary. */
uint8_t reserved[3];
@@ -69,6 +73,7 @@ enum edgetpu_usage_component {
/* Just the TPU core (scalar core and tiles) */
EDGETPU_USAGE_COMPONENT_TPU = 1,
/* Control core (ARM Cortex-R52 CPU) */
+ /* Note: this component is not reported as of metrics v2. */
EDGETPU_USAGE_COMPONENT_CONTROLCORE = 2,
EDGETPU_USAGE_COMPONENT_COUNT = 3, /* number of components above */
@@ -114,10 +119,16 @@ enum edgetpu_usage_counter_type {
/* The following counters are added in metrics v2. */
- /* Number of context switches on a compute core. */
+ /* Counter 11 not used on TPU. */
EDGETPU_COUNTER_CONTEXT_SWITCHES = 11,
- EDGETPU_COUNTER_COUNT = 12, /* number of counters above */
+ /* Number of TPU Cluster Reconfigurations. */
+ EDGETPU_COUNTER_RECONFIGURATIONS = 12,
+
+ /* Number of TPU Cluster Reconfigurations motivated exclusively by a preemption. */
+ EDGETPU_COUNTER_PREEMPT_RECONFIGURATIONS = 13,
+
+ EDGETPU_COUNTER_COUNT = 14, /* number of counters above */
};
/* Generic counter. Only reported if it has a value larger than 0. */
@@ -173,10 +184,11 @@ struct __packed edgetpu_usage_max_watermark {
/* Must be kept in sync with firmware enum class UsageTrackerThreadId. */
enum edgetpu_usage_threadid {
/* Individual thread IDs do not have identifiers assigned. */
- /* Thread ID 14, used for other IP, is not used for TPU */
+
+ /* Thread ID 14 is not used for TPU */
/* Number of task identifiers. */
- EDGETPU_FW_THREAD_COUNT = 14,
+ EDGETPU_FW_THREAD_COUNT = 17,
};
/* Statistics related to a single thread in firmware. */
@@ -225,8 +237,8 @@ struct edgetpu_usage_stats {
DECLARE_HASHTABLE(uid_hash_table, UID_HASH_BITS);
/* component utilization values reported by firmware */
int32_t component_utilization[EDGETPU_USAGE_COMPONENT_COUNT];
- int64_t counter[EDGETPU_COUNTER_COUNT];
- int64_t max_watermark[EDGETPU_MAX_WATERMARK_TYPE_COUNT];
+ int64_t counter[EDGETPU_COUNTER_COUNT][EDGETPU_USAGE_CLUSTERS_MAX];
+ int64_t max_watermark[EDGETPU_MAX_WATERMARK_TYPE_COUNT][EDGETPU_USAGE_CLUSTERS_MAX];
int32_t thread_stack_max[EDGETPU_FW_THREAD_COUNT];
struct mutex usage_stats_lock;
};
diff --git a/drivers/edgetpu/mobile-pm.c b/drivers/edgetpu/mobile-pm.c
index 53571e0..1e0cbb5 100644
--- a/drivers/edgetpu/mobile-pm.c
+++ b/drivers/edgetpu/mobile-pm.c
@@ -213,7 +213,7 @@ static int mobile_power_up(void *data)
usleep_range(BLOCK_DOWN_MIN_DELAY_US, BLOCK_DOWN_MAX_DELAY_US);
} while (++times < BLOCK_DOWN_RETRY_TIMES);
if (times >= BLOCK_DOWN_RETRY_TIMES && !platform_pwr->is_block_down(etdev))
- return -EAGAIN;
+ etdev_warn(etdev, "Block is still on\n");
}
etdev_info(etdev, "Powering up\n");
diff --git a/drivers/edgetpu/rio/config.h b/drivers/edgetpu/rio/config.h
index 4434a2e..db6ae79 100644
--- a/drivers/edgetpu/rio/config.h
+++ b/drivers/edgetpu/rio/config.h
@@ -26,6 +26,9 @@
/* Pre-allocate 1 IOMMU domain per VCID */
#define EDGETPU_NUM_PREALLOCATED_DOMAINS EDGETPU_NUM_VCIDS
+/* Number of TPU clusters for metrics handling. */
+#define EDGETPU_TPU_CLUSTER_COUNT 3
+
/* Placeholder value */
#define EDGETPU_TZ_MAILBOX_ID 31