.. _module-pw_presubmit:

============
pw_presubmit
============
The presubmit module provides Python tools for running presubmit checks and
checking and fixing code format. It also includes the presubmit check script for
the Pigweed repository, ``pigweed_presubmit.py``.

Presubmit checks are essential tools, but they take work to set up, and
projects don’t always get around to it. The ``pw_presubmit`` module provides
tools for setting up high quality presubmit checks for any project. We use this
framework to run Pigweed’s presubmit on our workstations and in our automated
building tools.

The ``pw_presubmit`` module also includes ``pw format``, a tool that provides a
unified interface for automatically formatting code in a variety of languages.
With ``pw format``, you can format Bazel, C, C++, Python, GN, and Go code
according to configurations defined by your project. ``pw format`` leverages
existing tools like ``clang-format``, and it’s simple to add support for new
languages. (Note: Bazel formatting requires ``buildifier`` to be present on your
system. If it is not, Bazel formatting passes without checking.)

.. image:: docs/pw_presubmit_demo.gif
   :alt: ``pw format`` demo
   :align: left

The ``pw_presubmit`` package includes presubmit checks that can be used with any
project. These checks include:

* Check code format of several languages including C, C++, and Python
* Initialize a Python environment
* Run all Python tests
* Run pylint
* Run mypy
* Ensure source files are included in the GN and Bazel builds
* Build and run all tests with GN
* Build and run all tests with Bazel
* Ensure all header files contain ``#pragma once``

-------------
Compatibility
-------------
Python 3

-------------------------------------------
Creating a presubmit check for your project
-------------------------------------------
Creating a presubmit check for a project using ``pw_presubmit`` is simple, but
requires some customization. Projects must define their own presubmit check
Python script that uses the ``pw_presubmit`` package.

A project's presubmit script can be registered as a
:ref:`pw_cli <module-pw_cli>` plugin, so that it can be run as ``pw
presubmit``.

Setting up the command-line interface
=====================================
The ``pw_presubmit.cli`` module sets up the command-line interface for a
presubmit script. This defines a standard set of arguments for invoking
presubmit checks. Its use is optional, but recommended.

pw_presubmit.cli
----------------
.. automodule:: pw_presubmit.cli
   :members: add_arguments, run

Presubmit output directory
--------------------------
The ``pw_presubmit`` command line interface includes an ``--output-directory``
option that specifies the working directory to use for presubmits. The default
path is ``out/presubmit``.  A subdirectory is created for each presubmit step.
This directory persists between presubmit runs and can be cleaned by deleting it
or running ``pw presubmit --clean``.

Presubmit checks
================
A presubmit check is defined as a function or other callable. The function must
accept one argument: a ``PresubmitContext``, which provides the paths on which
to run. Presubmit checks communicate failure by raising an exception.

Presubmit checks may use the ``filter_paths`` decorator to automatically filter
the paths list for file types they care about.

Either of these functions could be used as presubmit checks:

.. code-block:: python

  import subprocess

  import pw_presubmit
  from pw_presubmit import PresubmitContext, PresubmitFailure


  @pw_presubmit.filter_paths(endswith='.py')
  def file_contains_ni(ctx: PresubmitContext):
      for path in ctx.paths:
          with open(path) as file:
              contents = file.read()
              if 'ni' not in contents and 'nee' not in contents:
                  raise PresubmitFailure('Files must say "ni"!', path=path)


  def run_the_build(_):
      subprocess.run(['make', 'release'], check=True)
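
To illustrate what a path-filtering decorator of this kind does, here is a
minimal standard-library sketch. ``FakeContext`` and ``filter_by_suffix`` are
hypothetical stand-ins invented for this example and are not part of
``pw_presubmit``; the real ``filter_paths`` decorator supports more options.

.. code-block:: python

  import dataclasses
  import functools
  from pathlib import Path
  from typing import Callable, Tuple


  @dataclasses.dataclass(frozen=True)
  class FakeContext:
      """Hypothetical stand-in for PresubmitContext; holds only paths."""
      paths: Tuple[Path, ...]


  def filter_by_suffix(endswith: str) -> Callable:
      """Hypothetical decorator: pass only paths ending in ``endswith``."""
      def decorator(check: Callable[[FakeContext], None]) -> Callable:
          @functools.wraps(check)
          def wrapper(ctx: FakeContext) -> None:
              kept = tuple(p for p in ctx.paths if str(p).endswith(endswith))
              # Hand the check a copy of the context with the reduced list.
              check(dataclasses.replace(ctx, paths=kept))
          return wrapper
      return decorator


  seen = []


  @filter_by_suffix('.py')
  def record_python_files(ctx: FakeContext) -> None:
      seen.extend(ctx.paths)


  record_python_files(FakeContext(paths=(Path('a.py'), Path('b.cc'))))
  print(seen)  # Only a.py reaches the check.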

Presubmit check functions are grouped into "programs" -- a named series of
checks. Projects may find it helpful to have programs for different purposes,
such as a quick program for local use and a full program for automated use. The
:ref:`example script <example-script>` uses ``pw_presubmit.Programs`` to define
``quick`` and ``full`` programs.
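
The idea of a program as a named, possibly nested series of checks can be
sketched without ``pw_presubmit`` at all. The ``flatten`` helper and the three
placeholder checks below are hypothetical; the sketch only assumes that nested
programs are expanded in order and that a check listed more than once is kept
once. It is not how ``pw_presubmit.Programs`` is implemented.

.. code-block:: python

  from typing import Any, Callable, Dict, List


  def flatten(checks: Any) -> List[Callable]:
      """Expand nested tuples of checks, keeping the first copy of each."""
      result: List[Callable] = []
      for item in checks:
          # A callable is a single check; anything else is a nested program.
          expanded = [item] if callable(item) else flatten(item)
          for check in expanded:
              if check not in result:
                  result.append(check)
      return result


  def lint(_ctx): ...
  def unit_tests(_ctx): ...
  def release_build(_ctx): ...


  QUICK = (lint, unit_tests)
  FULL = (QUICK, release_build, lint)  # lint appears twice on purpose.

  programs: Dict[str, List[Callable]] = {
      'quick': flatten(QUICK),
      'full': flatten(FULL),
  }
  print([check.__name__ for check in programs['full']])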

``PresubmitContext`` has the following members:

* ``root``: Source checkout root directory
* ``repos``: Repositories (top-level and submodules) processed by
  ``pw presubmit``
* ``output_dir``: Output directory for this specific presubmit step
* ``failure_summary_log``: File path where steps should write a brief summary
  of any failures
* ``paths``: Modified files for the presubmit step to check (often used in
  formatting steps but ignored in compile steps)
* ``all_paths``: All files in the repository tree.
* ``package_root``: Root directory for ``pw package`` installations
* ``override_gn_args``: Additional GN args processed by ``build.gn_gen()``
* ``luci``: Information about the LUCI build or None if not running in LUCI
* ``num_jobs``: Number of jobs to run in parallel
* ``continue_after_build_error``: For steps that compile, don't exit on the
  first compilation error
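
As a rough illustration of how a check might use these members, the sketch
below reads ``paths`` and writes a short summary to ``failure_summary_log``.
``FakeContext`` is a hypothetical stand-in with only three of the members
listed above, and ``RuntimeError`` stands in for whatever exception a real
check would raise.

.. code-block:: python

  import dataclasses
  import tempfile
  from pathlib import Path
  from typing import Tuple


  @dataclasses.dataclass
  class FakeContext:
      """Hypothetical stand-in exposing three PresubmitContext members."""
      paths: Tuple[Path, ...]
      output_dir: Path
      failure_summary_log: Path


  def no_tabs(ctx: FakeContext) -> None:
      """Fail if any checked file contains a tab character."""
      offenders = [p for p in ctx.paths if '\t' in p.read_text()]
      if offenders:
          # Leave a brief summary where the caller expects to find it.
          ctx.failure_summary_log.write_text(
              '\n'.join(f'{p.name}: contains tabs' for p in offenders))
          raise RuntimeError(f'{len(offenders)} file(s) contain tabs')


  with tempfile.TemporaryDirectory() as tmp:
      out = Path(tmp)
      clean, tabbed = out / 'clean.txt', out / 'tabbed.txt'
      clean.write_text('spaces only\n')
      tabbed.write_text('has\ta tab\n')
      ctx = FakeContext((clean, tabbed), out, out / 'summary.log')
      try:
          no_tabs(ctx)
          failed = False
      except RuntimeError:
          failed = True
      summary = ctx.failure_summary_log.read_text()

  print(failed, summary)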

The ``luci`` member is of type ``LuciContext`` and has the following members:

* ``buildbucket_id``: The globally-unique buildbucket id of the build
* ``build_number``: The builder-specific incrementing build number, if
  configured for this builder
* ``project``: The LUCI project under which this build is running (often
  ``pigweed`` or ``pigweed-internal``)
* ``bucket``: The LUCI bucket under which this build is running (often ends
  with ``ci`` or ``try``)
* ``builder``: The builder being run
* ``swarming_server``: The swarming server on which this build is running
* ``swarming_task_id``: The swarming task id of this build
* ``cas_instance``: The CAS instance accessible from this build
* ``pipeline``: Information about the build pipeline, if applicable.
* ``triggers``: Information about triggering commits, if applicable.

The ``pipeline`` member, if present, is of type ``LuciPipeline`` and has the
following members:

* ``round``: The zero-indexed round number.
* ``builds_from_previous_iteration``: A list of the buildbucket ids from the
  previous round, if any, encoded as strings.

The ``triggers`` member is a sequence of ``LuciTrigger`` objects, which have the
following members:

* ``number``: The number of the change in Gerrit.
* ``patchset``: The number of the patchset of the change.
* ``remote``: The full URL of the remote.
* ``branch``: The name of the branch on which this change is being/was
  submitted.
* ``ref``: The ``refs/changes/..`` path that can be used to reference the
  patch for unsubmitted changes and the hash for submitted changes.
* ``gerrit_name``: The name of the googlesource.com Gerrit host.
* ``submitted``: Whether the change has been submitted or is still pending.

Additional members can be added by subclassing ``PresubmitContext`` and
``Presubmit``. Then override ``Presubmit._create_presubmit_context()`` to
return the subclass of ``PresubmitContext``. Finally, add
``presubmit_class=PresubmitSubClass`` when calling ``cli.run()``.

Substeps
--------
Presubmit steps can define substeps that can run independently in other tooling.
These steps should subclass ``SubStepCheck`` and must define a ``substeps()``
method that yields ``SubStep`` objects. ``SubStep`` objects have the following
members:

* ``name``: Name of the substep
* ``_func``: Substep code
* ``args``: Positional arguments for ``_func``
* ``kwargs``: Keyword arguments for ``_func``

``SubStep`` objects must have unique names. For a detailed example of a
``SubStepCheck`` subclass see ``GnGenNinja`` in ``build.py``.
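
The shape of a substep can be sketched with plain Python. ``FakeSubStep`` and
``FakeBuildCheck`` below are hypothetical stand-ins that mirror the member
names listed above; they are not the ``pw_presubmit`` classes, and the
``run_tool`` helper only records what it was asked to do.

.. code-block:: python

  import dataclasses
  from typing import Any, Callable, Dict, Iterator, List, Sequence


  @dataclasses.dataclass(frozen=True)
  class FakeSubStep:
      """Hypothetical stand-in with the four members described above."""
      name: str
      _func: Callable[..., Any]
      args: Sequence[Any] = ()
      kwargs: Dict[str, Any] = dataclasses.field(default_factory=dict)

      def __call__(self) -> Any:
          return self._func(*self.args, **self.kwargs)


  log: List[str] = []


  def run_tool(tool: str, *, target: str = 'default') -> None:
      log.append(f'{tool} {target}')


  class FakeBuildCheck:
      """Hypothetical check that exposes its work as named substeps."""

      def substeps(self) -> Iterator[FakeSubStep]:
          yield FakeSubStep('gen', run_tool, args=('gn',))
          yield FakeSubStep('build', run_tool, args=('ninja',),
                            kwargs={'target': 'host'})

      def __call__(self) -> None:
          names = set()
          for step in self.substeps():
              # Substep names must be unique within a check.
              if step.name in names:
                  raise ValueError(f'duplicate substep name: {step.name}')
              names.add(step.name)
              step()


  FakeBuildCheck()()
  print(log)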

Existing Presubmit Checks
-------------------------
A small number of presubmit checks are made available through ``pw_presubmit``
modules.

Code Formatting
^^^^^^^^^^^^^^^
Formatting checks for a variety of languages are available from
``pw_presubmit.format_code``. These include C/C++, Java, Go, Python, GN, and
others. All of these checks can be included by adding
``pw_presubmit.format_code.presubmit_checks()`` to a presubmit program. These
all use language-specific formatters like clang-format or black.

These will suggest fixes using ``pw format --fix``.

Options for code formatting can be specified in the ``pigweed.json`` file
(see also :ref:`SEED-0101 <seed-0101>`). These apply to both ``pw presubmit``
steps that check code formatting and ``pw format`` commands that either check
or fix code formatting.

* ``python_formatter``: Choice of Python formatter. Options are ``black`` (used
  by Pigweed itself) and ``yapf`` (the default).
* ``black_path``: If ``python_formatter`` is ``black``, use this as the
  executable instead of ``black``.

.. TODO(b/264578594) Add exclude to pigweed.json file.
.. * ``exclude``: List of path regular expressions to ignore.

Example section from a ``pigweed.json`` file:

.. code-block:: json

  {
    "pw": {
      "pw_presubmit": {
        "format": {
          "python_formatter": "black",
          "black_path": "black"
        }
      }
    }
  }
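
The following sketch shows one way the two options could be resolved to a
formatter executable, using only the standard ``json`` module. The function
name ``python_formatter_command`` is hypothetical, and this is not how
``pw_presubmit`` itself loads ``pigweed.json``; it only encodes the defaults
described above.

.. code-block:: python

  import json

  EXAMPLE = '''
  {
    "pw": {
      "pw_presubmit": {
        "format": {
          "python_formatter": "black",
          "black_path": "black"
        }
      }
    }
  }
  '''


  def python_formatter_command(config_text: str) -> str:
      """Return the formatter executable implied by a pigweed.json string."""
      config = json.loads(config_text)
      options = config.get('pw', {}).get('pw_presubmit', {}).get('format', {})
      # yapf is the documented default when no formatter is configured.
      formatter = options.get('python_formatter', 'yapf')
      if formatter == 'black':
          # black_path only applies when black is the chosen formatter.
          return options.get('black_path', 'black')
      return formatter


  print(python_formatter_command(EXAMPLE))  # black
  print(python_formatter_command('{}'))     # yapf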

Sorted Blocks
^^^^^^^^^^^^^
Blocks of code can be required to be kept in sorted order using comments like
the following:

.. code-block::

  # keep-sorted: start
  bar
  baz
  foo
  # keep-sorted: end

This can be included by adding ``pw_presubmit.keep_sorted.presubmit_check`` to a
presubmit program. Adding ``ignore-case`` to the start line will use
case-insensitive sorting.

By default, duplicates will be removed. Lines that are identical except in case
are preserved, even with ``ignore-case``. To allow duplicates, add
``allow-dupes`` to the start line.
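
The sorting and duplicate rules above can be sketched as a small function over
the lines of one block. ``sort_block`` is a hypothetical helper written for
this illustration; how the real tool orders lines that differ only in case is
not specified here, so the sketch simply keeps their original relative order.

.. code-block:: python

  from typing import List


  def sort_block(lines: List[str], *, ignore_case: bool = False,
                 allow_dupes: bool = False) -> List[str]:
      """Sort the lines found between the start and end markers."""
      if not allow_dupes:
          # dict.fromkeys drops exact duplicates only; lines that differ
          # in case are distinct keys, so they are preserved.
          lines = list(dict.fromkeys(lines))
      key = str.lower if ignore_case else None
      return sorted(lines, key=key)


  block = ['b', 'a', 'B', 'b']
  print(sort_block(block))                    # ['B', 'a', 'b']
  print(sort_block(block, ignore_case=True))  # ['a', 'b', 'B']
  print(sort_block(block, allow_dupes=True))  # ['B', 'a', 'b', 'b']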

Prefixes can be ignored by adding ``ignore-prefix=`` followed by a
comma-separated list of prefixes. The list below will be kept in this order.
Neither commas nor whitespace are supported in prefixes.

.. code-block::

  # keep-sorted: start ignore-prefix=',"
  'bar',
  "baz",
  'foo',
  # keep-sorted: end
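
Prefix handling can be sketched as a sort key that drops the first matching
prefix before comparing. Both helper names below are hypothetical, and the
sketch assumes the option value is split on commas exactly as described above.

.. code-block:: python

  from typing import List, Sequence


  def strip_prefix(line: str, prefixes: Sequence[str]) -> str:
      """Return ``line`` without leading whitespace or a matching prefix."""
      stripped = line.lstrip()
      for prefix in prefixes:
          if stripped.startswith(prefix):
              return stripped[len(prefix):]
      return stripped


  def sort_ignoring_prefixes(lines: List[str], option: str) -> List[str]:
      """Sort ``lines`` given the value that follows ``ignore-prefix=``."""
      prefixes = option.split(',')
      return sorted(lines, key=lambda line: strip_prefix(line, prefixes))


  # The option value is a single quote, a comma, and a double quote.
  option = "'" + ',' + '"'
  block = ['"baz",', "'foo',", "'bar',"]
  print(sort_ignoring_prefixes(block, option))

A plain ``sorted(block)`` would instead place the double-quoted entry first,
because the double quote character sorts before the single quote.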

Inline comments are assumed to be associated with the following line. For
example, the following is already sorted. This can be disabled with
``sticky-comments=no``.

.. todo-check: disable

.. code-block::

  # keep-sorted: start
  # TODO(b/1234) Fix this.
  bar,
  # TODO(b/5678) Also fix this.
  foo,
  # keep-sorted: end

.. todo-check: enable

By default, the prefix of the keep-sorted line is assumed to be the comment
marker used by any inline comments. This can be overridden by adding an option
like ``sticky-comments=%,#`` to the start line.

Lines indented more than the preceding line are assumed to be continuations.
Thus, the following block is already sorted. keep-sorted blocks cannot be
nested, so there's no ability to add a keep-sorted block for the sub-items.

.. code-block::

  # keep-sorted: start
  * abc
    * xyz
    * uvw
  * def
  # keep-sorted: end
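
Continuation handling can be sketched by grouping lines into items before
sorting. ``sort_items`` is a hypothetical helper, and it makes a simplifying
assumption: any line indented more than the first line of the current item
belongs to that item.

.. code-block:: python

  from typing import List


  def indent_of(line: str) -> int:
      """Count the leading whitespace characters of ``line``."""
      return len(line) - len(line.lstrip())


  def sort_items(lines: List[str]) -> List[str]:
      """Sort top-level items, keeping more-indented lines with their item."""
      items: List[List[str]] = []
      for line in lines:
          if items and indent_of(line) > indent_of(items[-1][0]):
              items[-1].append(line)  # Continuation of the current item.
          else:
              items.append([line])
      # Only the first line of each item takes part in the comparison.
      items.sort(key=lambda item: item[0])
      return [line for item in items for line in item]


  block = ['* def', '* abc', '  * xyz', '  * uvw']
  print('\n'.join(sort_items(block)))

The sub-items keep their original order, which matches the statement above
that sub-items cannot be sorted independently.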

The presubmit check will suggest fixes using ``pw keep-sorted --fix``.

Future versions may support additional multiline list items.

.gitmodules
^^^^^^^^^^^
Various rules can be applied to .gitmodules files. This check can be included
by adding ``pw_presubmit.gitmodules.create()`` to a presubmit program. This
function takes an optional argument of type ``pw_presubmit.gitmodules.Config``.
``Config`` objects have several properties.

* ``allow_non_googlesource_hosts: bool = False``: If false, all submodule URLs
  must be on a Google-managed Gerrit server.
* ``allowed_googlesource_hosts: Sequence[str] = ()``: If set, any
  Google-managed Gerrit URLs for submodules must be in this list. Entries
  should be like ``pigweed`` for ``pigweed-review.googlesource.com``.
* ``require_relative_urls: bool = False``: If true, all submodule URLs must be
  relative to the superproject remote.
* ``allow_sso: bool = True``: If false, ``sso://`` and ``rpc://`` submodule
  URLs are prohibited.
* ``allow_git_corp_google_com: bool = True``: If false, ``git.corp.google.com``
  submodule URLs are prohibited.
* ``require_branch: bool = False``: If true, all submodules must reference a
  branch.
* ``validator: Callable[[PresubmitContext, Path, str, Dict[str, str]], None] = None``:
  A function that can be used for arbitrary submodule validation. It's called
  with the ``PresubmitContext``, the path to the ``.gitmodules`` file, the name
  of the current submodule, and the properties of the current submodule.
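
A validator with the signature described above might look like the following
sketch. ``require_https`` is a hypothetical example, the context argument is
typed as ``Any`` so the snippet runs without ``pw_presubmit``, and
``ValueError`` stands in for whatever exception a real check would raise.
Note that this particular rule would also reject relative URLs.

.. code-block:: python

  from pathlib import Path
  from typing import Any, Dict


  def require_https(ctx: Any, path: Path, name: str,
                    props: Dict[str, str]) -> None:
      """Hypothetical validator: reject submodule URLs that are not HTTPS."""
      url = props.get('url', '')
      if not url.startswith('https://'):
          raise ValueError(
              f'{path}: submodule {name!r} has non-HTTPS URL {url!r}')


  # An HTTPS URL passes silently.
  require_https(None, Path('.gitmodules'), 'pigweed',
                {'url': 'https://pigweed.googlesource.com/pigweed/pigweed'})

  # Any other scheme is rejected.
  try:
      require_https(None, Path('.gitmodules'), 'legacy',
                    {'url': 'git://example.com/legacy'})
      rejected = False
  except ValueError:
      rejected = True
  print(rejected)

Such a function would presumably be supplied through the ``validator``
property of ``Config``.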

#pragma once
^^^^^^^^^^^^
There's a ``pragma_once`` check that confirms the first non-comment line of
C/C++ headers is ``#pragma once``. This is enabled by adding
``pw_presubmit.cpp_checks.pragma_once`` to a presubmit program.
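
The rule can be approximated in a few lines of standard Python. The function
below is a simplified sketch, not the upstream implementation: it assumes
blank lines are skipped and treats any line that starts a ``/* ... */``
comment as entirely comment.

.. code-block:: python

  from typing import Iterable


  def first_code_line_is_pragma_once(lines: Iterable[str]) -> bool:
      """Return True if the first non-comment line is ``#pragma once``."""
      in_block_comment = False
      for raw in lines:
          line = raw.strip()
          if in_block_comment:
              # Stay in the comment until a line contains the terminator.
              in_block_comment = '*/' not in line
              continue
          if not line or line.startswith('//'):
              continue
          if line.startswith('/*'):
              in_block_comment = '*/' not in line
              continue
          return line == '#pragma once'
      return False


  good = ['// Copyright', '/* multi', ' line */', '', '#pragma once', 'int x;']
  bad = ['// Copyright', '#include <x.h>', '#pragma once']
  print(first_code_line_is_pragma_once(good))  # True
  print(first_code_line_is_pragma_once(bad))   # False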

.. todo-check: disable

TODO(b/###) Formatting
^^^^^^^^^^^^^^^^^^^^^^^^^
There's a check that confirms ``TODO`` lines match a given format. Upstream
Pigweed expects these to look like ``TODO(b/###): Explanation``, but makes it
easy for projects to define their own pattern instead.

To use this check add ``todo_check.create(todo_check.BUGS_OR_USERNAMES)`` to a
presubmit program.
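
A project-specific pattern could be prototyped with the ``re`` module before
wiring it into a check. The pattern and helper below are hypothetical and
accept only the bug-reference form shown above; they are not the pattern that
``todo_check.BUGS_OR_USERNAMES`` uses.

.. code-block:: python

  import re

  # Hypothetical pattern for lines such as "TODO(b/123): Explanation".
  TODO_WITH_BUG = re.compile(r'TODO\(b/\d+\): \S')


  def bad_todo_lines(text: str):
      """Return (line number, line) pairs whose TODO lacks a bug reference."""
      return [(number, line)
              for number, line in enumerate(text.splitlines(), start=1)
              if 'TODO' in line and not TODO_WITH_BUG.search(line)]


  sample = '''\
  # TODO(b/1234): Remove this workaround.
  # TODO: fix later
  x = 1  # TODO(someone) tidy up
  '''
  print([number for number, _ in bad_todo_lines(sample)])  # [2, 3]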

.. todo-check: enable

Python Checks
^^^^^^^^^^^^^
There are two checks in the ``pw_presubmit.python_checks`` module, ``gn_pylint``
and ``gn_python_check``. They assume there's a top-level ``python`` GN target.
``gn_pylint`` runs Pylint and Mypy checks and ``gn_python_check`` runs Pylint,
Mypy, and all Python tests.

Inclusive Language
^^^^^^^^^^^^^^^^^^
.. inclusive-language: disable

The inclusive language check looks for words that are typical of non-inclusive
code, like using "master" and "slave" in place of "primary" and "secondary" or
"sanity check" in place of "consistency check".

.. inclusive-language: enable

These checks can be disabled for individual lines with
"inclusive-language: ignore" on the line in question or the line above it, or
for entire blocks by using "inclusive-language: disable" before the block and
"inclusive-language: enable" after the block.

.. In case things get moved around in the previous paragraphs the enable line
.. is repeated here: inclusive-language: enable.

OWNERS
^^^^^^
There's a check that requires folders matching specific patterns contain
``OWNERS`` files. It can be included by adding
``module_owners.presubmit_check()`` to a presubmit program. This function takes
a callable as an argument that indicates, for a given file, where a controlling
``OWNERS`` file should be, or returns None if no ``OWNERS`` file is necessary.
Formatting of ``OWNERS`` files is handled similarly to formatting of other
source files and is discussed in `Code Formatting`_.
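
The callable described above might look like the following sketch. The rule
itself (every top-level ``pw_*`` directory needs an ``OWNERS`` file) and the
name ``owners_for`` are hypothetical, and the sketch assumes paths are given
relative to the repository root.

.. code-block:: python

  from pathlib import Path
  from typing import Optional


  def owners_for(path: Path) -> Optional[Path]:
      """Hypothetical rule: files under a top-level pw_* directory need OWNERS."""
      parts = path.parts
      if len(parts) > 1 and parts[0].startswith('pw_'):
          return Path(parts[0]) / 'OWNERS'
      # Files elsewhere do not require an OWNERS file.
      return None


  print(owners_for(Path('pw_presubmit/py/setup.py')))
  print(owners_for(Path('README.md')))  # None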

Source in Build
^^^^^^^^^^^^^^^
Pigweed provides checks that source files are configured as part of the build
for GN, Bazel, and CMake. These can be included by adding
``source_in_build.gn(filter)`` and similar functions to a presubmit program. The
CMake check additionally requires a callable that invokes CMake with appropriate
options.

pw_presubmit
------------
.. automodule:: pw_presubmit
   :members: filter_paths, FileFilter, call, PresubmitFailure, Programs

Git hook
--------
You can run a presubmit program or step as a `git hook
<https://git-scm.com/book/en/v2/Customizing-Git-Git-Hooks>`_ using
``pw_presubmit.install_hook``.  This can be used to run certain presubmit
checks before a change is pushed to a remote.

We strongly recommend that you only run fast (< 15 seconds) and trivial checks
as push hooks, and perform slower or more complex ones in CI. This is because:

* Running slow checks in the push hook will force you to wait longer for
  ``git push`` to complete, and
* If your change fails one of the checks at this stage, it will not yet be
  uploaded to the remote, so you'll have a harder time debugging any failures
  (sharing the change with your colleagues, linking to it from an issue
  tracker, etc).

.. _example-script:

Example
=======
A simple example presubmit check script follows. It can be copied and pasted
to serve as a starting point for a project's presubmit check script.

See ``pigweed_presubmit.py`` for a more complex presubmit check script example.

.. code-block:: python

  """Example presubmit check script."""

  import argparse
  import logging
  import os
  from pathlib import Path
  import re
  import sys
  from typing import List, Optional, Pattern

  try:
      import pw_cli.log
  except ImportError:
      print('ERROR: Activate the environment before running presubmits!',
            file=sys.stderr)
      sys.exit(2)

  import pw_presubmit
  from pw_presubmit import (
      build,
      cli,
      cpp_checks,
      environment,
      format_code,
      git_repo,
      inclusive_language,
      filter_paths,
      python_checks,
      PresubmitContext,
  )
  from pw_presubmit.install_hook import install_git_hook

  # Set up variables for key project paths.
  PROJECT_ROOT = Path(os.environ['MY_PROJECT_ROOT'])
  PIGWEED_ROOT = PROJECT_ROOT / 'pigweed'

  # Rerun the build if files with these extensions change.
  _BUILD_EXTENSIONS = frozenset(
      ['.rst', '.gn', '.gni', *format_code.C_FORMAT.extensions])


  #
  # Presubmit checks
  #
  def release_build(ctx: PresubmitContext):
      build.gn_gen(ctx, build_type='release')
      build.ninja(ctx)


  def host_tests(ctx: PresubmitContext):
      build.gn_gen(ctx, run_host_tests='true')
      build.ninja(ctx)


  # Avoid running some checks on certain paths.
  PATH_EXCLUSIONS = (
      re.compile(r'^external/'),
      re.compile(r'^vendor/'),
  )


  # Use the upstream pragma_once check, but apply a different set of path
  # filters with @filter_paths.
  @filter_paths(endswith='.h', exclude=PATH_EXCLUSIONS)
  def pragma_once(ctx: PresubmitContext):
      cpp_checks.pragma_once(ctx)


  #
  # Presubmit check programs
  #
  OTHER = (
      # Checks not run by default but that should be available. These might
      # include tests that are expensive to run or that don't yet pass.
      build.gn_quick_check,
  )

  QUICK = (
      # List some presubmit checks to run
      pragma_once,
      host_tests,
      # Use the upstream formatting checks, with custom path filters applied.
      format_code.presubmit_checks(exclude=PATH_EXCLUSIONS),
      # Include the upstream inclusive language check.
      inclusive_language.presubmit_check,
      # Include just the lint-related Python checks.
      python_checks.gn_pylint.with_filter(exclude=PATH_EXCLUSIONS),
  )

  FULL = (
      QUICK,  # Add all checks from the 'quick' program
      release_build,
      # Use the upstream Python checks, with custom path filters applied.
      # Checks listed multiple times are only run once.
      python_checks.gn_python_check.with_filter(exclude=PATH_EXCLUSIONS),
  )

  PROGRAMS = pw_presubmit.Programs(other=OTHER, quick=QUICK, full=FULL)


  #
  # Allowlist of remote refs for presubmit. If the remote ref being pushed to
  # matches any of these values (with regex matching), then the presubmit
  # checks will be run before pushing.
  #
  PRE_PUSH_REMOTE_REF_ALLOWLIST = (
      'refs/for/main',
  )


  def run(install: bool, remote_ref: Optional[str],  **presubmit_args) -> int:
      """Process the --install argument then invoke pw_presubmit."""

      # Install the presubmit Git pre-push hook, if requested.
      if install:
          # '$remote_ref' will be replaced by the actual value of the remote ref
          # at runtime.
          install_git_hook('pre-push', [
              'python', '-m', 'tools.presubmit_check', '--base', 'HEAD~',
              '--remote-ref', '$remote_ref'
          ])
          return 0

      # Run the checks if either no remote_ref was passed, or if the remote ref
      # matches anything in the allowlist.
      if remote_ref is None or any(
              re.search(pattern, remote_ref)
              for pattern in PRE_PUSH_REMOTE_REF_ALLOWLIST):
          return cli.run(root=PROJECT_ROOT, **presubmit_args)

      # The remote ref is not in the allowlist, so there is nothing to check.
      return 0


  def main() -> int:
      """Run the presubmit checks for this repository."""
      parser = argparse.ArgumentParser(description=__doc__)
      cli.add_arguments(parser, PROGRAMS, 'quick')

      # Define an option for installing a Git pre-push hook for this script.
      parser.add_argument(
          '--install',
          action='store_true',
          help='Install the presubmit as a Git pre-push hook and exit.')

      # Define an optional flag to pass the remote ref into this script, if it
      # is run as a pre-push hook. The destination variable in the parsed args
      # will be `remote_ref`, as dashes are replaced with underscores to make
      # valid variable names.
      parser.add_argument(
          '--remote-ref',
          default=None,
          nargs='?',  # Make optional.
          help='Remote ref of the push command, for use by the pre-push hook.')

      return run(**vars(parser.parse_args()))

  if __name__ == '__main__':
      pw_cli.log.install(logging.INFO)
      sys.exit(main())
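
The allowlist condition used by ``run()`` in the script above can be tried in
isolation. The ``should_run`` helper is a hypothetical extraction of that
condition; it shows that ``re.search`` matches anywhere in the ref, so a
pattern needs explicit anchors if only an exact ref should match.

.. code-block:: python

  import re
  from typing import Optional, Sequence

  ALLOWLIST = ('refs/for/main',)


  def should_run(remote_ref: Optional[str],
                 allowlist: Sequence[str] = ALLOWLIST) -> bool:
      """Mirror the condition that decides whether checks run on push."""
      return remote_ref is None or any(
          re.search(pattern, remote_ref) for pattern in allowlist)


  print(should_run(None))                     # True
  print(should_run('refs/for/main'))          # True
  print(should_run('refs/for/main%wip'))      # True: the search is unanchored
  print(should_run('refs/heads/experiment'))  # False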

---------------------
Code formatting tools
---------------------
The ``pw_presubmit.format_code`` module formats supported source files using
external code format tools. The file ``format_code.py`` can be invoked directly
from the command line or from ``pw`` as ``pw format``.

Example
=======
A simple example of adding support for a custom format follows. This code wraps
the built-in formatter to add a new format. The same approach could also be used
to replace a formatter or to remove or disable a Pigweed-supplied one.

.. code-block:: python

  #!/usr/bin/env python
  """Formats files in repository. """

  import logging
  import sys

  import pw_cli.log
  from pw_presubmit import FileFilter, format_code
  from pw_presubmit.format_code import CodeFormat
  from your_project import presubmit_checks
  from your_project import your_check

  YOUR_CODE_FORMAT = CodeFormat('YourFormat',
                                filter=FileFilter(suffix=('.your', )),
                                check=your_check.check,
                                fix=your_check.fix)

  CODE_FORMATS = (*format_code.CODE_FORMATS, YOUR_CODE_FORMAT)

  def _run(exclude, **kwargs) -> int:
      """Check and fix formatting for source files in the repo."""
      return format_code.format_paths_in_repo(exclude=exclude,
                                              code_formats=CODE_FORMATS,
                                              **kwargs)


  def main():
      return _run(**vars(format_code.arguments(git_paths=True).parse_args()))


  if __name__ == '__main__':
      pw_cli.log.install(logging.INFO)
      sys.exit(main())