aboutsummaryrefslogtreecommitdiff
path: root/gopls/internal/lsp/protocol/generate/README.md
diff options
context:
space:
mode:
Diffstat (limited to 'gopls/internal/lsp/protocol/generate/README.md')
-rw-r--r--gopls/internal/lsp/protocol/generate/README.md136
1 files changed, 136 insertions, 0 deletions
diff --git a/gopls/internal/lsp/protocol/generate/README.md b/gopls/internal/lsp/protocol/generate/README.md
new file mode 100644
index 000000000..c8047f32b
--- /dev/null
+++ b/gopls/internal/lsp/protocol/generate/README.md
@@ -0,0 +1,136 @@
+# LSP Support for gopls
+
+## The protocol
+
+The LSP protocol exchanges json-encoded messages between the client and the server.
+(gopls is the server.) The messages are either Requests, which require Responses, or
+Notifications, which generate no response. Each Request or Notification has a method name
+such as "textDocument/hover" that indicates its meaning and determines which function in the server will handle it.
+The protocol is described in a
+[web page](https://microsoft.github.io/language-server-protocol/specifications/lsp/3.18/specification/),
+in words, and in a json file (metaModel.json) available either linked towards the bottom of the
+web page, or in the vscode-languageserver-node repository. This code uses the latter so the
+exact version can be tied to a githash. By default, the command will download the `github.com/microsoft/vscode-languageserver-node` repository to a temporary directory.
+
+The specification has five sections
+1. Requests, which describe the Request and Response types for request methods (e.g., *textDocument/didChange*),
+2. Notifications, which describe the Request types for notification methods,
+3. Structures, which describe named struct-like types,
+4. TypeAliases, which describe type aliases,
+5. Enumerations, which describe named constants.
+
+Requests and Notifications are tagged with a Method (e.g., `"textDocument/hover"`).
+The specification does not specify the names of the functions that handle the messages. These
+names are specified by the `methodNames` map. Enumerations generate Go `const`s, but
+in Typescript they are scoped to namespaces, while in Go they are scoped to a package, so the Go names
+may need to be modified to avoid name collisions. (See the `disambiguate` map, and its use.)
+
+Finally, the specified types are Typescript types, which are quite different from Go types.
+
+### Optionality
+The specification can mark fields in structs as Optional. The client distinguishes between missing
+fields and `null` fields in some cases. The Go translation for an optional type
+should be making sure the field's value
+can be `nil`, and adding the json tag `,omitempty`. The former condition would be satisfied by
+adding `*` to the field's type if the type is not a reference type.
+
+### Types
+The specification uses a number of different types, only a few of which correspond directly to Go types.
+The specification's types are "base", "reference", "map", "literal", "stringLiteral", "tuple", "and", "or".
+The "base" types correspond directly to Go types, although some Go types needs to be chosen for `URI` and `DocumentUri`. (The "base" types`RegExp`, `BooleanLiteral`, `NumericLiteral` never occur.)
+
+"reference" types are the struct-like types in the Structures section of the specification. The given
+names are suitable for Go to use, except the code needs to change names like `_Initialze` to `XInitialize` so
+they are exported for json marshaling and unmarshaling.
+
+"map" types are just like Go. (The key type in all of them is `DocumentUri`.)
+
+"stringLiteral" types are types whose type name and value are a single string. The chosen Go equivalent
+is to make the type `string` and the value a constant. (The alternative would be to generate a new
+named type, which seemed redundant.)
+
+"literal" types are like Go anonymous structs, so they have to be given a name. (All instances
+of the remaining types have to be given names. One approach is to construct the name from the components
+of the type, but this leads to misleading punning, and is unstable if components are added. The other approach
+is to construct the name from the context of the definition, that is, from the types it is defined within.
+For instance `Lit__InitializeParams_clientInfo` is the "literal" type at the
+`clientInfo` field in the `_InitializeParams`
+struct. Although this choice is sensitive to the ordering of the components, the code uses this approach,
+presuming that reordering components is an unlikely protocol change.)
+
+"tuple" types are generated as Go structs. (There is only one, with two `uint32` fields.)
+
+"and" types are Go structs with embedded type names. (There is only one, `And_Param_workspace_configuration`.)
+
+"or" types are the most complicated. There are a lot of them and there is no simple Go equivalent.
+They are defined as structs with a single `Value interface{}` field and custom json marshaling
+and unmarshaling code. Users can assign anything to `Value` but the type will be checked, and
+correctly marshaled, by the custom marshaling code. The unmarshaling code checks types, so `Value`
+will have one of the permitted types. (`nil` is always allowed.) There are about 40 "or" types that
+have a single non-null component, and these are converted to the component type.
+
+## Processing
+The code parses the json specification file, and scans all the types. It assigns names, as described
+above, to the types that are unnamed in the specification, and constructs Go equivalents as required.
+(Most of this code is in typenames.go.)
+
+There are four output files. tsclient.go and tsserver.go contain the definition and implementation
+of the `protocol.Client` and `protocol.Server` types and the code that dispatches on the Method
+of the Request or Notification. tsjson.go contains the custom marshaling and unmarshaling code.
+And tsprotocol.go contains the type and const definitions.
+
+### Accommodating gopls
+As the code generates output, mostly in generateoutput.go and main.go,
+it makes adjustments so that no changes are required to the existing Go code.
+(Organizing the computation this way makes the code's structure simpler, but results in
+a lot of unused types.)
+There are three major classes of these adjustments, and leftover special cases.
+
+The first major
+adjustment is to change generated type names to the ones gopls expects. Some of these don't change the
+semantics of the type, just the name.
+But for historical reasons a lot of them replace "or" types by a single
+component of the type. (Until fairly recently Go only saw or used only one of components.)
+The `goplsType` map in tables.go controls this process.
+
+The second major adjustment is to the types of fields of structs, which is done using the
+`renameProp` map in tables.go.
+
+The third major adjustment handles optionality, controlling `*` and `,omitempty` placement when
+the default rules don't match what gopls is expecting. (The map is `goplsStar`, also in tables.go)
+(If the intermediate components in expressions of the form `A.B.C.S` were optional, the code would need
+a lot of useless checking for nils. Typescript has a language construct to avoid most checks.)
+
+Then there are some additional special cases. There are a few places with adjustments to avoid
+recursive types. For instance `LSPArray` is `[]LSPAny`, but `LSPAny` is an "or" type including `LSPArray`.
+The solution is to make `LSPAny` an `interface{}`. Another instance is `_InitializeParams.trace`
+whose type is an "or" of 3 stringLiterals, which just becomes a `string`.
+
+### Checking
+`TestAll(t *testing.T)` checks that there are no unexpected fields in the json specification.
+
+While the code is executing, it checks that all the entries in the maps in tables.go are used.
+It also checks that the entries in `renameProp` and `goplsStar` are not redundant.
+
+As a one-time check on the first release of this code, diff-ing the existing and generated tsclient.go
+and tsserver.go code results in only whitespace and comment diffs. The existing and generated
+tsprotocol.go differ in whitespace and comments, and in a substantial number of new type definitions
+that the older, more heuristic, code did not generate. (And the unused type `_InitializeParams` differs
+slightly between the new and the old, and is not worth fixing.)
+
+### Some history
+The original stub code was written by hand, but with the protocol under active development, that
+couldn't last. The web page existed before the json specification, but it lagged the implementation
+and was hard to process by machine. So the earlier version of the generating code was written in Typescript, and
+used the Typescript compiler's API to parse the protocol code in the repository.
+It then used a set of heuristics
+to pick out the elements of the protocol, and another set of overlapping heuristics to create the Go code.
+The output was functional, but idiosyncratic, and the code was fragile and barely maintainable.
+
+### The future
+Most of the adjustments using the maps in tables.go could be removed by making changes, mostly to names,
+in the gopls code. Using more "or" types in gopls requires more elaborate, but stereotyped, changes.
+But even without all the adjustments, making this its own module would face problems; a number of
+dependencies would have to be factored out. And, it is fragile. The custom unmarshaling code knows what
+types it expects. A design that return an 'any' on unexpected types would match the json
+'ignore unexpected values' philosophy better, but the the Go code would need extra checking.