.. _module-pw_metric:

=========
pw_metric
=========

.. attention::
   This module is **not yet production ready**; ask us if you are interested in
   trying it out or have ideas about how to improve it.

--------
Overview
--------
Pigweed's metric module is a **lightweight manual instrumentation system** for
tracking system health metrics like counts or set values. For example,
``pw_metric`` could help with tracking the number of I2C bus writes, or the
number of times a buffer was filled before it could drain in time, or safely
incrementing counters from ISRs.

Key features of ``pw_metric``:

- **Tokenized names** - Names are tokenized using the ``pw_tokenizer`` enabling
  long metric names that don't bloat your binary.

- **Tree structure** - Metrics can form a tree, enabling grouping of related
  metrics for clearer organization.

- **Per object collection** - Metrics and groups can live on object instances
  and be flexibly combined with metrics from other instances.

- **Global registration** - For legacy code bases or just because it's easier,
  ``pw_metric`` supports automatic aggregation of metrics. This is optional but
  convenient in many cases.

- **Simple design** - There are only two core data structures: ``Metric`` and
  ``Group``, which are both simple to understand and use. The only metric
  types supported are ``uint32_t`` and ``float``. This module does not support
  complicated aggregations like running average or min/max.
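
To illustrate why tokenized names keep binaries small, the sketch below hashes
a metric name to a 31-bit token at compile time, so only the integer has to be
stored on the device. This is a minimal illustration and not ``pw_tokenizer``:
``TokenSketch`` and ``DetokenizeSketch`` are hypothetical names, and FNV-1a is
swapped in as a stand-in hash.

.. code-block:: cpp

   #include <cassert>
   #include <cstdint>
   #include <string_view>

   // Hypothetical stand-in for pw_tokenizer: FNV-1a, masked to 31 bits.
   constexpr uint32_t TokenSketch(std::string_view name) {
     uint32_t hash = 2166136261u;  // FNV-1a 32-bit offset basis.
     for (char ch : name) {
       hash ^= static_cast<uint8_t>(ch);
       hash *= 16777619u;  // FNV-1a 32-bit prime.
     }
     return hash & 0x7fffffffu;  // Keep 31 bits, leaving one bit spare.
   }

   // Computed at compile time; the device only needs to store the integer.
   constexpr uint32_t kBusErrorsToken =
       TokenSketch("i2c_bus_errors_since_boot");

   // Host side: recover a name by hashing a table of known names.
   inline std::string_view DetokenizeSketch(uint32_t token) {
     assert(token <= 0x7fffffffu);  // Tokens in this sketch are 31 bits wide.
     constexpr std::string_view kKnownNames[] = {
         "i2c_bus_errors_since_boot", "attempts", "successes"};
     for (std::string_view name : kKnownNames) {
       if (TokenSketch(name) == token) {
         return name;
       }
     }
     return {};  // Unknown token.
   }

The name table lives only on the host, which is why long, descriptive metric
names cost nothing extra on the device.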

Example: Instrumenting a single object
--------------------------------------
The example below illustrates what instrumenting a class with a metric group
and metrics might look like. In this case, the object's
``MySubsystem::metrics()`` member is not globally registered; the user is
responsible for combining this subsystem's metrics with others.

.. code-block::

   #include "pw_metric/metric.h"

   class MySubsystem {
    public:
     void DoSomething() {
       attempts_.Increment();
       if (ActionSucceeds()) {
         successes_.Increment();
       }
     }
     pw::metric::Group& metrics() { return metrics_; }

    private:
     PW_METRIC_GROUP(metrics_, "my_subsystem");
     PW_METRIC(metrics_, attempts_, "attempts", 0u);
     PW_METRIC(metrics_, successes_, "successes", 0u);
   };

The metrics subsystem has no canonical output format at this time, but a JSON
dump might look something like this:

.. code-block:: none

  {
    "my_subsystem" : {
      "successes" : 1000,
      "attempts" : 1200,
    }
  }

In this case, every instance of ``MySubsystem`` will have unique counters.

Example: Instrumenting a legacy codebase
----------------------------------------
A common situation in embedded development is **debugging legacy code** or code
which is hard to change; where it is perhaps impossible to plumb metrics
objects around with dependency injection. The alternative to plumbing metrics
is to register the metrics through a global mechanism. ``pw_metric`` supports
this use case. For example:

**Before instrumenting:**

.. code-block::

   // This code was passed down from generations of developers before; no one
   // knows what it does or how it works. But it needs to be fixed!
   void OldCodeThatDoesntWorkButWeDontKnowWhy() {
     if (some_variable) {
       DoSomething();
     } else {
       DoSomethingElse();
     }
   }

**After instrumenting:**

.. code-block::

   #include "pw_metric/global.h"
   #include "pw_metric/metric.h"

   PW_METRIC_GLOBAL(legacy_do_something, "legacy_do_something", 0u);
   PW_METRIC_GLOBAL(legacy_do_something_else, "legacy_do_something_else", 0u);

   // This code was passed down from generations of developers before; no one
   // knows what it does or how it works. But it needs to be fixed!
   void OldCodeThatDoesntWorkButWeDontKnowWhy() {
     if (some_variable) {
       legacy_do_something.Increment();
       DoSomething();
     } else {
       legacy_do_something_else.Increment();
       DoSomethingElse();
     }
   }

In this case, the developer merely had to add the metrics header, define some
metrics, and then start incrementing them. These metrics will be available
globally through the ``pw::metric::global_metrics`` object defined in
``pw_metric/global.h``.
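
One way such global registration can work is for each metric to link itself
into a global intrusive list from its constructor. The sketch below is a
stdlib-only illustration of that idea, not the ``pw_metric`` implementation;
``GlobalCounterSketch`` and ``CountRegistered`` are hypothetical names.

.. code-block:: cpp

   #include <cassert>
   #include <cstdint>

   // Hypothetical self-registering counter; not the real pw::metric::Metric.
   class GlobalCounterSketch {
    public:
     explicit GlobalCounterSketch(const char* name)
         : name_(name), next_(head_) {
       assert(name != nullptr);
       head_ = this;  // Push onto the global list; no central table needed.
     }

     void Increment(uint32_t amount = 1) { value_ += amount; }
     uint32_t value() const { return value_; }
     const char* name() const { return name_; }
     const GlobalCounterSketch* next() const { return next_; }
     static const GlobalCounterSketch* head() { return head_; }

    private:
     // A plain pointer is constant-initialized before any constructor runs,
     // so registration works regardless of static construction order.
     static inline GlobalCounterSketch* head_ = nullptr;

     const char* name_;
     GlobalCounterSketch* next_;
     uint32_t value_ = 0;
   };

   // Defined wherever convenient; each one registers itself.
   GlobalCounterSketch legacy_do_something("legacy_do_something");
   GlobalCounterSketch legacy_do_something_else("legacy_do_something_else");

   // Walks the list, as an exporter would, and counts the registered metrics.
   inline int CountRegistered() {
     int count = 0;
     for (const GlobalCounterSketch* m = GlobalCounterSketch::head();
          m != nullptr; m = m->next()) {
       ++count;
     }
     return count;
   }

Because the list head needs no dynamic initialization, any translation unit can
define a counter without coordinating with the code that later walks the list.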

Why not just use simple counter variables?
------------------------------------------
One might wonder what the point of leveraging a metric library is when it is
trivial to make some global variables and print them out. There are a few
reasons:

- **Metrics offload** - To make it easy to get metrics off-device by sharing
  the infrastructure for offloading.

- **Consistent format** - To get the metrics in a consistent format (e.g.
  protobuf or JSON) for analysis.

- **Uncoordinated collection** - To provide a simple and reliable way for
  developers on a team to all collect metrics for their subsystems, without
  having to coordinate to offload. This could extend to code in libraries
  written by other teams.

- **Pre-boot or interrupt visibility** - Some of the most challenging bugs come
  from early system boot when not all system facilities are up (e.g. logging or
  UART). In those cases, metrics provide a low-overhead approach to understand
  what is happening. Metrics can be incremented during early boot and dumped
  after boot to show what happened. While basic counter variables can work in
  these contexts too, one still has to deal with the offloading problem, which
  the library handles.

---------------------
Metrics API reference
---------------------

The metrics API consists of just a few components:

- The core data structures ``pw::metric::Metric`` and ``pw::metric::Group``
- The macros for scoped metrics and groups ``PW_METRIC`` and
  ``PW_METRIC_GROUP``
- The macros for globally registered metrics and groups
  ``PW_METRIC_GLOBAL`` and ``PW_METRIC_GROUP_GLOBAL``
- The global groups and metrics list: ``pw::metric::global_groups`` and
  ``pw::metric::global_metrics``.

Metric
------
The ``pw::metric::Metric`` provides:

- A 31-bit tokenized name
- A 1-bit discriminator for int or float
- A 32-bit payload (int or float)
- A 32-bit next pointer (intrusive list)

The metric object is 12 bytes on 32-bit platforms.
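
The layout above can be sketched with a packed name word and a union payload.
This is a hypothetical stand-in (``MetricLayoutSketch`` and its helpers are
not part of ``pw_metric``), and which bit holds the discriminator is an
assumption made for illustration.

.. code-block:: cpp

   #include <cassert>
   #include <cstdint>

   constexpr uint32_t kTokenMask = 0x7fffffffu;  // Low 31 bits: name token.
   constexpr uint32_t kFloatBit = 0x80000000u;   // High bit: float payload.

   // Mirrors the fields listed above, minus the next pointer (4 more bytes on
   // a 32-bit target, for 12 bytes in total).
   struct MetricLayoutSketch {
     uint32_t name_and_type;
     union {
       uint32_t as_int;
       float as_float;
     } payload;
   };
   static_assert(sizeof(MetricLayoutSketch) == 8, "token + payload is 8 bytes");

   constexpr bool IsFloat(const MetricLayoutSketch& m) {
     return (m.name_and_type & kFloatBit) != 0;
   }

   constexpr uint32_t Token(const MetricLayoutSketch& m) {
     return m.name_and_type & kTokenMask;
   }

   inline MetricLayoutSketch MakeInt(uint32_t token, uint32_t value) {
     MetricLayoutSketch m{};
     m.name_and_type = token & kTokenMask;
     m.payload.as_int = value;
     return m;
   }

   inline MetricLayoutSketch MakeFloat(uint32_t token, float value) {
     MetricLayoutSketch m{};
     m.name_and_type = (token & kTokenMask) | kFloatBit;
     m.payload.as_float = value;
     return m;
   }

   // The discriminator guards against updating the payload as the wrong type.
   inline void Increment(MetricLayoutSketch& m, uint32_t amount = 1) {
     assert(!IsFloat(m));
     m.payload.as_int += amount;
   }

Sharing one word between the token and the type bit is what keeps the object
small: no separate type field or vtable pointer is needed.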

.. cpp:class:: pw::metric::Metric

   .. cpp:function:: Increment(uint32_t amount = 1)

      Increment the metric by the given amount. Results in undefined behaviour if
      the metric is not of type int.

   .. cpp:function:: Set(uint32_t value)

      Set the metric to the given value. Results in undefined behaviour if the
      metric is not of type int.

   .. cpp:function:: Set(float value)

      Set the metric to the given value. Results in undefined behaviour if the
      metric is not of type float.

Group
-----
The ``pw::metric::Group`` object is simply:

- A name for the group
- A list of children groups
- A list of leaf metrics
- A 32-bit next pointer (intrusive list)

The group object is 16 bytes on 32-bit platforms.

.. cpp:class:: pw::metric::Group

   .. cpp:function:: Dump(int indent_level = 0)

      Recursively dump a metrics group to ``pw_log``. Produces output like:

      .. code-block:: none

         "$6doqFw==": {
           "$05OCZw==": {
             "$VpPfzg==": 1,
             "$LGPMBQ==": 1.000000,
             "$+iJvUg==": 5,
           }
           "$9hPNxw==": 65,
           "$oK7HmA==": 13,
           "$FCM4qQ==": 0,
         }

      Note that the metric names are emitted as Base64-encoded tokens. Decoding
      requires using the Pigweed detokenizer. With a detokenizing-enabled
      logger, you could get something like:

      .. code-block:: none

         "i2c_1": {
           "gyro": {
             "num_samples": 1,
             "init_time_us": 1.000000,
             "initialized": 5,
           }
           "bus_errors": 65,
           "transactions": 13,
           "bytes_sent": 0,
         }
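
The nested output follows directly from the tree structure: a group prints its
name, then its child groups and metrics one indent level deeper. The sketch
below is stdlib-only and hypothetical (``GroupSketch``, ``MetricSketch``, and
``DumpToString`` are not ``pw_metric`` names). It uses ``std::vector`` instead
of intrusive lists, supports integer metrics only, and writes to a string
rather than ``pw_log``; the exact ordering and format are assumptions.

.. code-block:: cpp

   #include <cassert>
   #include <cstddef>
   #include <cstdint>
   #include <string>
   #include <vector>

   struct MetricSketch {
     std::string name;
     uint32_t value;
   };

   struct GroupSketch {
     std::string name;
     std::vector<MetricSketch> metrics;
     std::vector<GroupSketch> children;
   };

   // Appends `group` to `out`, indenting two spaces per nesting level.
   inline void DumpToString(const GroupSketch& group,
                            std::string& out,
                            int indent_level = 0) {
     assert(indent_level >= 0);
     const std::string indent(static_cast<std::size_t>(indent_level) * 2, ' ');
     out += indent + "\"" + group.name + "\": {\n";
     for (const GroupSketch& child : group.children) {
       DumpToString(child, out, indent_level + 1);  // Recurse into subgroups.
     }
     for (const MetricSketch& metric : group.metrics) {
       out += indent + "  \"" + metric.name +
              "\": " + std::to_string(metric.value) + ",\n";
     }
     out += indent + "}\n";
   }

The recursion depth equals the depth of the group tree, which is fixed once the
tree has been built.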

Macros
------
The **macros are the primary mechanism for creating metrics**, and should be
used instead of directly constructing metrics or groups. The macros handle
tokenizing the metric and group names.

.. cpp:function:: PW_METRIC(identifier, name, value)
.. cpp:function:: PW_METRIC(group, identifier, name, value)
.. cpp:function:: PW_METRIC_STATIC(identifier, name, value)
.. cpp:function:: PW_METRIC_STATIC(group, identifier, name, value)

   Declare a metric, optionally adding it to a group.

   - **identifier** - An identifier name for the created variable or member.
     For example: ``i2c_transactions`` might be used as a local or global
     metric; inside a class, could be named according to members
     (``i2c_transactions_`` for Google's C++ style).
   - **name** - The string name for the metric. This will be tokenized. There
     are no restrictions on the contents of the name; however, consider
     restricting these to be valid C++ identifiers to ease integration with
     other systems.
   - **value** - The initial value for the metric. Must be either a floating
     point value (e.g. ``3.2f``) or unsigned int (e.g. ``21u``).
   - **group** - A ``pw::metric::Group`` instance. If provided, the metric is
     added to the given group.

   The macro declares a variable or member named ``identifier`` with type
   ``pw::metric::Metric``, and works in three contexts: global, local, and
   member.

   If the ``_STATIC`` variant is used, the macro declares a variable with static
   storage. These can be used in function scopes, but not in classes.

   1. At global scope:

      .. code-block::

         PW_METRIC(foo, "foo", 0u);  // Integer metric, so Increment() is valid.

         void MyFunc() {
           foo.Increment();
         }

   2. At local function or member function scope:

      .. code-block::

         void MyFunc() {
           PW_METRIC(foo, "foo", 0u);  // Integer metric, so Increment() is valid.
           foo.Increment();
           // foo goes out of scope here; be careful!
         }

   3. At member level inside a class or struct:

      .. code-block::

         struct MyStructy {
           void DoSomething() {
             somethings.Increment();
           }
           // Every instance of MyStructy will have a separate somethings counter.
           PW_METRIC(somethings, "somethings", 0u);
         };

   You can also put a metric into a group with the macro. A metric can belong
   to at most one group; otherwise an assertion will fail. Example:

   .. code-block::

      PW_METRIC_GROUP(my_group, "my_group");
      PW_METRIC(my_group, foo, "foo", 0.2f);
      PW_METRIC(my_group, bar, "bar", 44000u);
      PW_METRIC(my_group, zap, "zap", 3.14f);

   .. tip::
      If you want a globally registered metric, see ``pw_metric/global.h``; in
      that context, metrics are globally registered without the need to
      centrally register in a single place.

.. cpp:function:: PW_METRIC_GROUP(identifier, name)
.. cpp:function:: PW_METRIC_GROUP(parent_group, identifier, name)
.. cpp:function:: PW_METRIC_GROUP_STATIC(identifier, name)
.. cpp:function:: PW_METRIC_GROUP_STATIC(parent_group, identifier, name)

   Declares a ``pw::metric::Group`` with the given name; the name is tokenized.
   Works similarly to ``PW_METRIC`` and can be used in the same contexts
   (global, local, and member). Optionally, the group can be added to a parent
   group.

   If the ``_STATIC`` variant is used, the macro declares a variable with static
   storage. These can be used in function scopes, but not in classes.

   Example:

   .. code-block::

      PW_METRIC_GROUP(my_group, "my_group");
      PW_METRIC(my_group, foo, "foo", 0.2f);
      PW_METRIC(my_group, bar, "bar", 44000u);
      PW_METRIC(my_group, zap, "zap", 3.14f);

.. cpp:function:: PW_METRIC_GLOBAL(identifier, name, value)

   Declare a ``pw::metric::Metric`` with the given name, and register it in the
   global metrics list ``pw::metric::global_metrics``.

   Example:

   .. code-block::

      #include "pw_metric/metric.h"
      #include "pw_metric/global.h"

      // No need to coordinate collection of foo and bar; they're autoregistered.
      PW_METRIC_GLOBAL(foo, "foo", 0.2f);
      PW_METRIC_GLOBAL(bar, "bar", 44000u);

   Note that metrics defined with ``PW_METRIC_GLOBAL`` should never be added to
   groups defined with ``PW_METRIC_GROUP_GLOBAL``. Each metric can only belong
   to one group, and metrics defined with ``PW_METRIC_GLOBAL`` are
   pre-registered with the global metrics list.

   .. attention::
      Do not create ``PW_METRIC_GLOBAL`` instances anywhere other than global
      scope. Putting these on an instance (member context) would lead to dangling
      pointers and misery. Metrics are never deleted or unregistered!

.. cpp:function:: PW_METRIC_GROUP_GLOBAL(identifier, name)

   Declare a ``pw::metric::Group`` with the given name, and register it in the
   global metric groups list ``pw::metric::global_groups``.

   Note that metrics created with ``PW_METRIC_GLOBAL`` should never be added to
   groups! Instead, just create a freestanding metric and register it into the
   global group (like in the example below).

   Example:

   .. code-block::

      #include "pw_metric/metric.h"
      #include "pw_metric/global.h"

      // No need to coordinate collection of this group; it's globally registered.
      PW_METRIC_GROUP_GLOBAL(legacy_system, "legacy_system");
      PW_METRIC(legacy_system, foo, "foo", 0.2f);
      PW_METRIC(legacy_system, bar, "bar", 44000u);

   .. attention::
      Do not create ``PW_METRIC_GROUP_GLOBAL`` instances anywhere other than
      global scope. Putting these on an instance (member context) would lead to
      dangling pointers and misery. Metrics are never deleted or unregistered!

----------------------
Usage & Best Practices
----------------------
This library makes several tradeoffs to enable low memory use per-metric, and
one of those tradeoffs results in requiring care in constructing the metric
trees.

Use the Init() pattern for static objects with metrics
------------------------------------------------------
A common pattern in embedded systems is to allocate many objects globally, and
reduce reliance on dynamic allocation (or eschew malloc entirely). This leads
to a pattern where rich/large objects are statically constructed at global
scope, then interacted with via tasks or threads. For example, consider a
hypothetical global ``Uart`` object:

.. code-block::

   class Uart {
    public:
     Uart(pw::span<std::byte> rx_buffer, pw::span<std::byte> tx_buffer)
       : rx_buffer_(rx_buffer), tx_buffer_(tx_buffer) {}

     // Send/receive here...

    private:
     pw::span<std::byte> rx_buffer_;
     pw::span<std::byte> tx_buffer_;
   };

   std::array<std::byte, 512> uart_rx_buffer;
   std::array<std::byte, 512> uart_tx_buffer;
   Uart uart1(uart_rx_buffer, uart_tx_buffer);

Through the course of building a product, the team may want to add metrics to
the UART, for example to gain insight into which operations are triggering lots
of data transfer. When adding metrics to the above imaginary UART object, one
might consider the following approach:

.. code-block::

   class Uart {
    public:
     Uart(pw::span<std::byte> rx_buffer,
          pw::span<std::byte> tx_buffer,
          pw::metric::Group& parent_metrics)
       : rx_buffer_(rx_buffer),
         tx_buffer_(tx_buffer) {
       // PROBLEM! parent_metrics may not be constructed if it's a reference
       // to a static global.
       parent_metrics.Add(tx_bytes_);
       parent_metrics.Add(rx_bytes_);
     }

     // Send/receive here which increment tx/rx_bytes.

    private:
     pw::span<std::byte> rx_buffer_;
     pw::span<std::byte> tx_buffer_;

     PW_METRIC(tx_bytes_, "tx_bytes", 0u);
     PW_METRIC(rx_bytes_, "rx_bytes", 0u);
   };

   PW_METRIC_GROUP(global_metrics, "/");
   PW_METRIC_GROUP(global_metrics, uart1_metrics, "uart1");

   std::array<std::byte, 512> uart_rx_buffer;
   std::array<std::byte, 512> uart_tx_buffer;
   Uart uart1(uart_rx_buffer,
              uart_tx_buffer,
              uart1_metrics);

However, this **is incorrect**, since the ``parent_metrics`` (pointing to
``uart1_metrics`` in this case) may not be constructed at the point of
``uart1`` getting constructed. Thankfully in the case of ``pw_metric`` this
will result in an assertion failure (or it will work correctly if the
constructors are called in a favorable order), so the problem will not go
unnoticed.

Instead, consider using the ``Init()`` pattern for static objects. It separates
global object construction into two phases: the constructor, which may store
references to dependencies but must not call any methods on them, and an
``Init()`` function, which is called after all static constructors have run.
This approach works correctly, even when the objects are allocated globally:

.. code-block::

   class Uart {
    public:
     // Note that metrics is not passed in here at all.
     Uart(pw::span<std::byte> rx_buffer,
          pw::span<std::byte> tx_buffer)
       : rx_buffer_(rx_buffer),
         tx_buffer_(tx_buffer) {}

     // Precondition: parent_metrics is already constructed.
     void Init(pw::metric::Group& parent_metrics) {
       parent_metrics.Add(tx_bytes_);
       parent_metrics.Add(rx_bytes_);
     }

     // Send/receive here which increment tx/rx_bytes.

    private:
     pw::span<std::byte> rx_buffer_;
     pw::span<std::byte> tx_buffer_;

     PW_METRIC(tx_bytes_, "tx_bytes", 0u);
     PW_METRIC(rx_bytes_, "rx_bytes", 0u);
   };

   PW_METRIC_GROUP(root_metrics, "/");
   PW_METRIC_GROUP(root_metrics, uart1_metrics, "uart1");

   std::array<std::byte, 512> uart_rx_buffer;
   std::array<std::byte, 512> uart_tx_buffer;
   Uart uart1(uart_rx_buffer,
              uart_tx_buffer);

   int main() {
     // uart1_metrics is guaranteed to be initialized by this point, so it is
     // safe to pass it to Init().
     uart1.Init(uart1_metrics);
   }

.. attention::
   Be extra careful about **static global metric registration**. Consider using
   the ``Init()`` pattern.

Metric member order matters in objects
--------------------------------------
The order of declaring in-class groups and metrics matters if the metrics are
within a group declared inside the class. For example, the following class will
work fine:

.. code-block::

   #include "pw_metric/metric.h"

   class PowerSubsystem {
    public:
     pw::metric::Group& metrics() { return metrics_; }
     const pw::metric::Group& metrics() const { return metrics_; }

    private:
     PW_METRIC_GROUP(metrics_, "power");  // Note metrics_ declared first.
     PW_METRIC(metrics_, foo, "foo", 0.2f);
     PW_METRIC(metrics_, bar, "bar", 44000u);
   };

but the following one will not, since the group is declared after the metrics
that refer to it (this results in a compile error):

.. code-block::

   #include "pw_metric/metric.h"

   class PowerSubsystem {
    public:
     pw::metric::Group& metrics() { return metrics_; }
     const pw::metric::Group& metrics() const { return metrics_; }

    private:
     PW_METRIC(metrics_, foo, "foo", 0.2f);
     PW_METRIC(metrics_, bar, "bar", 44000u);
     PW_METRIC_GROUP(metrics_, "power");  // Error: metrics_ must be first.
   };

.. attention::

   Put **groups before metrics** when declaring metrics members inside classes.
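
The underlying rule is that C++ initializes non-static data members in
declaration order, so a member that registers itself with a sibling must be
declared after that sibling. The following stdlib-only sketch shows the working
order; ``RegistrySketch`` and ``EntrySketch`` are hypothetical stand-ins for a
group and a metric.

.. code-block:: cpp

   #include <cassert>
   #include <cstddef>
   #include <vector>

   // Stand-in for a group: records the entries registered with it.
   class RegistrySketch {
    public:
     void Add(const char* name) { names_.push_back(name); }
     std::size_t size() const { return names_.size(); }

    private:
     std::vector<const char*> names_;
   };

   // Stand-in for a metric: registers itself with its group on construction.
   class EntrySketch {
    public:
     EntrySketch(RegistrySketch& registry, const char* name) {
       assert(name != nullptr);
       registry.Add(name);  // Requires `registry` to be constructed already.
     }
   };

   class PowerSubsystemSketch {
    public:
     std::size_t metric_count() const { return metrics_.size(); }

    private:
     RegistrySketch metrics_;  // Declared first, so constructed first.
     EntrySketch foo_{metrics_, "foo"};
     EntrySketch bar_{metrics_, "bar"};
   };

Moving ``metrics_`` below ``foo_`` and ``bar_`` in this sketch would make the
entries call ``Add()`` on a registry that has not been constructed yet.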

Thread safety
-------------
``pw_metric`` has **no built-in synchronization for manipulating the tree**
structure. Users are expected to either rely on a shared global mutex when
constructing the metric tree, or do the metric construction in a single thread
(e.g. a boot/init thread). The same applies for destruction, though we do not
advise destructing metrics or groups.

Individual metrics have atomic ``Increment()`` and ``Set()`` operations and
atomic value accessors (``as_float()`` and ``as_int()``). These don't require
separate synchronization and can be used from ISRs.
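
What atomic updates provide can be sketched with ``std::atomic``: each update
is a single indivisible read-modify-write, so an ISR that preempts a thread
mid-update cannot cause a lost increment. The class below is a hypothetical
stand-in, not the ``pw_metric`` implementation, and the relaxed memory ordering
is an assumption.

.. code-block:: cpp

   #include <atomic>
   #include <cassert>
   #include <cstdint>

   // ISR-safe use requires lock-free atomics on the target.
   static_assert(std::atomic<uint32_t>::is_always_lock_free,
                 "32-bit atomics must be lock-free");

   class AtomicCounterSketch {
    public:
     void Increment(uint32_t amount = 1) {
       value_.fetch_add(amount, std::memory_order_relaxed);
     }
     void Set(uint32_t value) {
       value_.store(value, std::memory_order_relaxed);
     }
     uint32_t as_int() const {
       return value_.load(std::memory_order_relaxed);
     }

    private:
     std::atomic<uint32_t> value_{0};
   };

   // Sequential stand-in for two contexts updating one counter. Real
   // preemption is not simulated here; the point is that no lock is taken.
   inline uint32_t InterleavedDemo() {
     AtomicCounterSketch rx_bytes;
     for (int i = 0; i < 1000; ++i) {
       rx_bytes.Increment();   // As if called from a thread.
       rx_bytes.Increment(2);  // As if called from an ISR.
     }
     assert(rx_bytes.as_int() == 3000u);
     return rx_bytes.as_int();
   }

Keeping the payload at 32 bits is what makes lock-free updates available on
typical 32-bit microcontrollers.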

.. attention::

   **You must synchronize changes to the metric tree**. ``pw_metric`` does not
   internally synchronize access during construction. Metric ``Set()`` and
   ``Increment()`` calls are safe without extra synchronization.

Lifecycle
---------
Metric objects are not designed to be destructed, and are expected to live for
the lifetime of the program or application. If you need dynamic
creation/destruction of metrics, ``pw_metric`` does not attempt to cover that
use case. Instead, ``pw_metric`` covers the case of products with two execution
phases:

1. A boot phase where the metric tree is created.
2. A run phase where metrics are collected. The tree structure is fixed.

Technically, it is possible to destruct metrics provided care is taken to
remove the given metric (or group) from the list it's contained in. However,
there are no helper functions for this, so be careful.

Below is an example that **is incorrect**. Don't do what follows!

.. code-block::

   #include "pw_metric/metric.h"

   int main() {
     PW_METRIC_GROUP(root, "/");
     {
       // BAD! The metrics have a different lifetime than the group.
       PW_METRIC(root, temperature, "temperature_f", 72.3f);
       PW_METRIC(root, humidity, "humidity_relative_percent", 33.2f);
     }
     // OOPS! root now has a linked list that points to the destructed
     // "humidity" object.
   }

.. attention::
   **Don't destruct metrics**. Metrics are designed to be registered /
   structured upfront, then manipulated during a device's active phase. They do
   not support destruction.

-----------------
Exporting metrics
-----------------
Collecting metrics on a device is not useful without a mechanism to export
those metrics for analysis and debugging. ``pw_metric`` offers optional RPC
service libraries (``:metric_service_nanopb`` based on nanopb, and
``:metric_service_pwpb`` based on pw_protobuf) that enable exporting a
user-supplied set of on-device metrics via RPC. This facility is intended to
function from the early stages of device bringup through production in the
field.

The metrics are fetched by calling the ``MetricService.Get`` RPC method, which
streams all registered metrics to the caller in batches (server streaming RPC).
Batching the returned metrics avoids requiring a large buffer or large RPC MTU.
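
The batching idea can be sketched independently of ``pw_rpc``: walk the metrics
once and hand a small fixed-capacity buffer to a send callback each time it
fills up. Everything below (``MetricRecordSketch``, ``StreamInBatches``, and
the batch size) is hypothetical and only illustrates why a small buffer is
enough.

.. code-block:: cpp

   #include <array>
   #include <cassert>
   #include <cstddef>
   #include <cstdint>
   #include <functional>
   #include <vector>

   struct MetricRecordSketch {
     uint32_t token;
     uint32_t value;
   };

   constexpr std::size_t kBatchSize = 3;  // Assumed; sized to fit the MTU.

   using SendFn =
       std::function<void(const MetricRecordSketch* batch, std::size_t count)>;

   // Streams `metrics` to `send` in batches of at most kBatchSize records and
   // returns the number of batches sent.
   inline std::size_t StreamInBatches(
       const std::vector<MetricRecordSketch>& metrics, const SendFn& send) {
     assert(send != nullptr);
     std::array<MetricRecordSketch, kBatchSize> buffer{};
     std::size_t used = 0;
     std::size_t batches = 0;
     for (const MetricRecordSketch& metric : metrics) {
       buffer[used++] = metric;
       if (used == kBatchSize) {  // Buffer full: send it and start over.
         send(buffer.data(), used);
         used = 0;
         ++batches;
       }
     }
     if (used > 0) {  // Flush the final, partially filled batch.
       send(buffer.data(), used);
       ++batches;
     }
     return batches;
   }

Memory use is bounded by ``kBatchSize`` regardless of how many metrics the
device has registered.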

The returned metric objects have flattened paths to the root. For example, the
returned metrics (post detokenization and jsonified) might look something like:

.. code-block:: none

   {
     "/i2c1/failed_txns": 17,
     "/i2c1/total_txns": 2013,
     "/i2c1/gyro/resets": 24,
     "/i2c1/gyro/hangs": 1,
     "/spi1/thermocouple/reads": 242,
     "/spi1/thermocouple/temp_celsius": 34.52,
   }

Note that there is no nesting of the groups; the nesting is implied from the
path.
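
Flattening can be sketched as a depth-first walk that carries the path prefix
down the tree. The sketch is stdlib-only and hypothetical (``GroupNodeSketch``
and ``Flatten`` are not ``pw_metric`` names). It works on strings rather than
tokens, and the ``/`` separator handling is an assumption based on the example
output.

.. code-block:: cpp

   #include <cassert>
   #include <map>
   #include <string>
   #include <utility>
   #include <vector>

   struct GroupNodeSketch {
     std::string name;
     std::vector<std::pair<std::string, double>> metrics;
     std::vector<GroupNodeSketch> children;
   };

   // Adds every metric under `group` to `out`, keyed by its full path.
   inline void Flatten(const GroupNodeSketch& group,
                       const std::string& prefix,
                       std::map<std::string, double>& out) {
     assert(!group.name.empty());
     const std::string path = prefix + "/" + group.name;
     for (const auto& [name, value] : group.metrics) {
       out[path + "/" + name] = value;
     }
     for (const GroupNodeSketch& child : group.children) {
       Flatten(child, path, out);  // Children inherit the parent's path.
     }
   }

A host tool can rebuild the nesting by splitting each key on ``/``.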

RPC service setup
-----------------
To expose a ``MetricService`` in your application, do the following:

1. Define metrics around the system, and put them in a group or list of
   metrics. Easy choices include the ``global_groups`` and ``global_metrics``
   variables; or create your own.
2. Create an instance of ``pw::metric::MetricService``.
3. Register the service with your RPC server.

For example:

.. code-block::

   #include "pw_rpc/server.h"
   #include "pw_metric/metric.h"
   #include "pw_metric/global.h"
   #include "pw_metric/metric_service_nanopb.h"

   // Note: You must customize the RPC server setup; see pw_rpc.
   Channel channels[] = {
    Channel::Create<1>(&uart_output),
   };
   Server server(channels);

   // Metric service instance, pointing to the global metric objects.
   // This could also point to custom per-product or application objects.
   pw::metric::MetricService metric_service(
       pw::metric::global_metrics,
       pw::metric::global_groups);

   void RegisterServices() {
     server.RegisterService(metric_service);
     // Register other services here.
   }

   int main() {
     // ... system initialization ...

     RegisterServices();

     // ... start your application ...
   }

.. attention::
   Take care when exporting metrics. Ensure **appropriate access control** is in
   place. In some cases it may make sense to entirely disable metrics export for
   production builds. Although reading metrics via RPC won't influence the
   device, in some cases the metrics could expose sensitive information if
   product owners are not careful.

.. attention::
   **MetricService::Get is a synchronous RPC method**

   Calls to ``MetricService::Get`` are blocking and will send all metrics
   immediately, even though it is a server-streaming RPC. This will work fine if
   the device doesn't have too many metrics, or doesn't have concurrent RPCs
   like logging, but could be a problem in some cases.

   We plan to offer an async version where the application is responsible for
   pumping the metrics into the streaming response. This gives flow control to
   the application.

-----------
Size report
-----------
The size report below shows the cost in code and memory for a few examples of
metrics. This does not include the RPC service.

.. include:: metric_size_report

.. attention::
   At time of writing, **the above sizes show an unexpectedly large flash
   impact**. We are investigating why GCC is inserting large global static
   constructors per group, when all the logic should be reused across objects.

-------------
Metric Parser
-------------
The ``metric_parser`` Python module requests the system metrics via RPC, parses
the response while detokenizing the group and metric names, and returns the
metrics in a dictionary organized by group and value.

----------------
Design tradeoffs
----------------
There are many possible approaches to metrics collection and aggregation. We've
chosen some points on the tradeoff curve:

- **Atomic-sized metrics** - Using simple metric objects with just uint32/float
  enables atomic operations. While it might be nice to support larger types, it
  is more useful to have safe metric increments from interrupt service
  routines.

- **No aggregate metrics (yet)** - Aggregate metrics (e.g. average, max, min,
  histograms) are not supported, and must be built on top of the simple base
  metrics. By taking this route, we can considerably simplify the core metrics
  system and have aggregation logic in separate modules. Those modules can then
  feed into the metrics system - for example by creating multiple metrics for a
  single underlying metric. For example: "foo", "foo_max", "foo_min" and so on.

  The other problem with automatic aggregation is that what period the
  aggregation happens over is often important, and it can be hard to design
  this cleanly into the API. Instead, this responsibility is pushed to the user
  who must take more care.

  Note that we will add helpers for aggregated metrics.

- **No virtual metrics** - An alternate approach to the concrete Metric class
  in the current module is to have a virtual interface for metrics, and then
  allow those metrics to have their own storage. This is attractive but can
  lead to many vtables and excess memory use in simple one-metric use cases.

- **Linked list registration** - Using linked lists for registration is a
  tradeoff, accepting some memory overhead in exchange for flexibility. Other
  alternatives include a global table of metrics, which has the disadvantage of
  requiring centralizing the metrics -- an impossibility for middleware like
  Pigweed.

- **Synchronization** - The only synchronization guarantee provided by
  ``pw_metric`` is that increment and set are atomic. Other than that, users
  are on their own to synchronize metric collection and updating.

- **No fast metric lookup** - The current design does not make it fast to
  look up a metric at runtime; instead, one must run a linear search of the
  tree to find the matching metric. In most non-dynamic use cases, this is fine
  in practice, and saves having a more involved hash table. Metric updates will
  be through direct member or variable accesses.

- **Relying on C++ static initialization** - In short, the convenience
  outweighs the cost and risk. Without static initializers, it would be
  impossible to automatically collect the metrics without post-processing the
  C++ code to find the metrics; a large undertaking of debatable value. We
  have carefully analyzed the static initializer behaviour of Pigweed's
  IntrusiveList and are confident it is correct.

- **Both local & global support** - Potentially just one approach (the local or
  global one) could be offered, making the module less complex. However, we
  feel the additional complexity is worthwhile since there are legitimate use
  cases for both, e.g. ``PW_METRIC`` and ``PW_METRIC_GLOBAL``. We'd prefer to
  have a well-tested upstream solution for these use cases rather than have
  customers re-implement one of these.
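
As an illustration of the aggregation point above, a helper outside the core
module could own several plain values and update them together, exporting them
as separate metrics such as "foo", "foo_min", and "foo_max". The sketch below
is hypothetical (``MinMaxSketch`` is not part of ``pw_metric``). Plain
``uint32_t`` fields stand in for metrics, there is no synchronization, and the
aggregation period is left to the caller through ``Reset()``.

.. code-block:: cpp

   #include <algorithm>
   #include <cassert>
   #include <cstdint>
   #include <limits>

   struct MinMaxSketch {
     uint32_t last = 0;                                          // "foo"
     uint32_t min_value = std::numeric_limits<uint32_t>::max();  // "foo_min"
     uint32_t max_value = 0;                                     // "foo_max"
     uint32_t samples = 0;

     // Updates all derived values from one new sample.
     void Record(uint32_t value) {
       last = value;
       min_value = std::min(min_value, value);
       max_value = std::max(max_value, value);
       ++samples;
     }

     // Range of the samples seen this period; requires at least one sample.
     uint32_t Spread() const {
       assert(samples > 0);
       return max_value - min_value;
     }

     // Starts a new aggregation period; the caller decides when.
     void Reset() { *this = MinMaxSketch{}; }
   };

Making the caller choose when to call ``Reset()`` reflects the concern above
that the aggregation period is hard to design cleanly into a general API.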

----------------
Roadmap & Status
----------------
- **String metric names** - ``pw_metric`` stores metric names as tokens. On one
  hand, this is great for production where having a compact binary is often a
  requirement to fit the application in the given part. However, in early
  development before flash is a constraint, string names are more convenient to
  work with since there is no need for host-side detokenization. We plan to add
  optional support for string names.

- **Aggregate metrics** - We plan to add support for aggregate metrics on top
  of the simple metric mechanism, either as another module or as additional
  functionality inside this one. Likely examples include min and max.

- **Selectively enable or disable metrics** - Currently the metrics are always
  enabled once included. In practice this is not ideal since many times only a
  few metrics are wanted in production, but having to strip all the metrics
  code is error prone. Instead, we will add support for controlling what
  metrics are enabled or disabled at compile time. This may rely on C++20's
  support for zero-sized members to fully remove the cost.

- **Async RPC** - The current RPC service exports the metrics by streaming
  them to the client in batches. However, the current solution streams all the
  metrics to completion; this may block the RPC thread. In the future we will
  have an async solution where the user is in control of flow priority.

- **Timer integration** - We would like to add a stopwatch type mechanism to
  time multiple in-flight events.

- **C support** - In practice it's often useful or necessary to instrument
  C-only code. While it will be impossible to support the global registration
  system that the C++ version supports, we will figure out a solution to make
  instrumenting C code relatively smooth.

- **Global counter** - We may add a global metric counter to help detect cases
  where post-initialization metrics manipulations are done.

- **Proto structure** - It may be possible to directly map metrics to a custom
  proto structure, where instead of a name or token field, a tag field is
  provided. This could result in elegant export to an easily machine parsable
  and compact representation on the host. We may investigate this in the
  future.

- **Safer data structures** - At a cost of 4B per metric and 4B per group, it
  may be possible to make metric structure instantiation safe even in static
  constructors, and also make it safe to remove metrics dynamically. We will
  consider whether this tradeoff is the right one, since a 4B cost per metric
  is substantial on projects with many metrics.