Fix the Ord instance for Ident + some other small fixes #42
Open
yav wants to merge 55 commits into harpocrates:master
In Rust 1.37, `catch`, `yield`, and `dyn` ceased to be "weak" keywords.
This simplifies the grammar a bit \o/.
`do catch { ... }` has turned into `try { ... }` and `async`-prefixed
blocks are now a thing.
Also re-jiggered the arguments of the `Closure` constructor to match that of the Rust AST.
This uses the new `await` keyword.
`TupleStructP`, `TupleP`, `SliceP`, all used to accept `..` in them. The parsing rules were awful, and the fields were confusing. Now, `..` is its own pattern (although semantically it doesn't make sense outside of the cases I just listed). Also, a `ParenP` was added. I have not fixed `Resolve` yet to take advantage of this.
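For reference, here is a minimal Rust sketch of the three contexts where a `..` rest pattern is semantically meaningful (tuple struct, tuple, and slice patterns). The type and function names are illustrative only and are not taken from this PR or its test suite.

```rust
/// Illustrative tuple struct (hypothetical, not from the PR).
pub struct Triple(pub i32, pub i32, pub i32);

/// Slice pattern: `..` skips everything between the first and last element.
pub fn first_and_last(values: &[i32]) -> Option<(i32, i32)> {
    match values {
        [] => None,
        [only] => Some((*only, *only)),
        [first, .., last] => Some((*first, *last)),
    }
}

/// Tuple struct pattern: `..` skips the leading fields.
pub fn last_of_triple(t: &Triple) -> i32 {
    let Triple(.., last) = t;
    *last
}

/// Tuple pattern: `..` skips the trailing fields.
pub fn head_of_tuple(t: (i32, i32, i32)) -> i32 {
    let (head, ..) = t;
    head
}
```

In each case `..` sits where an ordinary sub-pattern could sit, which is why treating it as a pattern of its own keeps the grammar simpler.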
Rust got full-blown support for or-patterns (see [RFC 2535][0]). This
means a couple changes:
* `OrP` is a new variant of `Pat`
* `WhileLet`, `IfLet`, `Arm` now just take a `Pat` (instead of a list)
* in the parser, or-patterns are not allowed everywhere that regular
patterns are!
Test cases were heavily inspired by [the PR that implemented the RFC][1].
[0]: https://github.com/rust-lang/rfcs/blob/master/text/2535-or-patterns.md#grammar
[1]: rust-lang/rust#63693
Whenever a context doesn't support or-patterns out of the box, the trick is to add an extra set of parentheses around the pattern.
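A small Rust sketch of both situations, assuming a toolchain where or-patterns are stable: a nested or-pattern that needs no extra syntax, and a `let` binding where a top-level or-pattern has to be wrapped in parentheses. The function names are made up for illustration.

```rust
/// Nested or-pattern: `0 | 1` appears inside `Some(..)` with no extra syntax.
pub fn classify(value: Option<u8>) -> &'static str {
    match value {
        Some(0 | 1) => "small",
        Some(_) => "large",
        None => "missing",
    }
}

/// `let` does not accept a top-level or-pattern directly, so the pattern is
/// wrapped in parentheses. Both alternatives bind `n`, making it irrefutable.
pub fn unwrap_either(r: Result<i32, i32>) -> i32 {
    let (Ok(n) | Err(n)) = r;
    n
}
```

Dropping the parentheses in `unwrap_either` is rejected by `rustc`, which is the kind of context-dependent restriction the parser has to mirror.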
* Updated incorrect or misformatted Haddock docstrings
* Removed trailing spaces from files
* Updated the copyright year
* Fixed up the `.cabal` file (more warnings enabled, tested-with)
`Fn` and `MethodSig` moved constness, safety, abi, and now asyncness into a new type called `FnHeader`. We do the same. I've also started fixing the `rustc-tests`, but they still don't pass.
A variety of fixes to the expected JSON, plus a good set of improvements to the error messages for failed `rustc-tests` cases:
* stack traces
* suggestions for keys
* array bounds (on out of bound)
* We are gradually moving over to the new AST of `Generics`. In this
commit, we cleaned up the naming of generic bounds
* Fix a nasty ambiguity around `async {}` statements. The issue is the
same as `unsafe {}` statements - the parser can't know soon
enough whether it is dealing with the beginning of a function
definition or the beginning of an expression.
* Fix final outstanding `rustc-tests` TODOs introduced during the
1.37 bump
* Finally introduce `GenericParam`:
- `PathParameters` renamed to `GenericArgs`
- `LifetimeDef` -> `LifetimeParam` variant of `GenericParam`
- `TyParam` -> `TypeParam` variant of `GenericParam`
* `MethodCall` takes a path segment (although the parsing is still
very incomplete in this area)
* `rustc-tests` on the committed `sample-sources` works!!!
The existing parsing, printing, and test code has been adjusted, but no work was done to support the new constructor for bound constraints.
The restriction that statement items should only have inherited or public visibility has been lifted (although I'm not sure what the visibility means at all...).
* no longer require an ordering on entries of generics
* parse/print/resolve const arguments and parameters
* parse/print/resolve bound constraints
* amended `rustc-tests` to work with the new constructs
These now build and run properly
* Proper handling of single semicolon statements (and diffing)
* `rustc-tests` now also check the exit code of `rustc` (instead of
just checking whether its stdout is valid JSON)
* Fix some incorrect handling of types with pluses
* Support parsing macro definitions (that use the `macro` keyword)
* Add attributes to fields and field patterns
* Rework pretty-printing of path types to be.. prettier... in
multi-line mode.
With these fixes almost all scraped files pass the `rustc-tests` tests
* general (possibly unnamed) function arguments are now only allowed
in bare function types
* macro items in braces can have a trailing `;`
* fix some pretty-printing issues
I scraped 4400 more test cases from Rust's testsuite. With the
following fixes, only about 44 of the tests still fail.
* some keywords were unreserved
* trailing plus on bare trait objects
* new ABIs
* exclusive range patterns
* bug around parsing of `if break { }`/`if yield { }`/`if return { }`
* underscore crate import `extern crate foo as _`
* initializer expressions are allowed on any enum variant
* lexer is more permissive around accepted whitespace
* lexer allows underscores in character literals
* properly lex `/**/` as a comment
* normalize windows newlines in inline-style comments
* test for `OpaqueTy`, `OrP` in difference tests
Previously, only expressions could (and had to) put a `::` discriminator between identifiers and generics (so as to disambiguate with the less than operator). Now, type paths can do this too (although they do not _have_ to). The parsing paths for type and expression paths are now much more similar. Also fixed a bounds issue on trait aliases (`trait Foo = ?Send` is now allowed).
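To make the distinction concrete, here is a short Rust sketch (the function name is illustrative): the `::` before the generics is optional in type position but required in expression position.

```rust
/// Doubles every element of the input slice.
pub fn doubled(input: &[i32]) -> Vec::<i32> {
    // Type position: `Vec::<i32>` and `Vec<i32>` are both accepted.
    let out: Vec::<i32> = input
        .iter()
        .map(|x| x * 2)
        // Expression position: the `::` is mandatory, since `collect<Vec<i32>>`
        // would start parsing as a less-than comparison.
        .collect::<Vec<i32>>();
    out
}
```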
This is motivated by two useful features I've been manually patching
into the testsuite for some time:
* pointing the testsuite at a _different_ folder of sources
* automatically deleting a source test case if `rustc` can't initially
parse it
* `static || { 1 };`, `async || { 1 };`, `async { 1 }`, `unsafe { 1 }`
and company finally parse as statements! Along the way, I refactored
and commented heavily the statement/expression-conflict-motivated
rules.
* `union::a + 1;`, `auto { x: 1 }`, and company also parse as statements!
* `ItemMac` no longer takes an optional identifier - the _only_ valid
form is `macro_rules! foo { ... }`. The grammar also reflects this.
* abstract out some duplicate parsing code for lambda expressions,
accept lambda expressions in more positions (esp. those with an
explicit result type).
* fix associativity of comparison operators
* where bound predicates parse empty bound lists
* invalid suffixes lead to parse errors, not crashes
* replace `sep_by1T` with `sep_byT` where possible
* allow `const _: <ty> = ...`
* add support for foreign macros
* parse where clauses on trait aliases
* support attributes on expressions inside of a `let`
* support self crate renamings (`extern crate self as foo`)
* take into account the crate root in the `QSelf` index
* Edge case for parsing: `macro_rules` can be the name of a user defined
macro, and can be called manually. Example: `macro_rules!("my call!")`.
Parsing this is a bit more tricky though, due to the old style of macro
definitions: `macro_rules! my_macro { ... }`.
* Also added a top-level entry point into the path parser. Type paths are
now strictly more general than all other paths, so it makes sense to
use them as "general" paths.
* Allow bare trait objects to start with lifetimes
Block expressions can be broken out of using `break 'lbl <expr?>`. However, this requires blocks to be labelled. This commit adds:
* required AST changes for labelled block expressions
* parsing of labelled block expressions
* printing/resolving of labelled block expressions
* adjusting all of the test cases and adding a couple new ones
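As a reminder of what the syntax looks like on the Rust side, here is a minimal sketch of a labelled block used as an expression, assuming a toolchain where `break` with a value out of labelled blocks is stable. The function name and label are illustrative.

```rust
/// Returns the first negative element, if any, by breaking out of a
/// labelled block with a value.
pub fn first_negative(values: &[i32]) -> Option<i32> {
    let found = 'search: {
        for &v in values {
            if v < 0 {
                // Exits the block labelled `'search`, yielding `Some(v)`.
                break 'search Some(v);
            }
        }
        // Reached only when no `break 'search` fired.
        None
    };
    found
}
```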
* parsing
* printing/resolving
* `rustc-tests`
Allow failures on nightly.
Tests can now build and pass on GHC 8.8
Previously the instance was incorrect because it'd cause an infinite loop. This version rearranges the fields of the records to ensure that the hash field is first, which makes it possible to derive Eq and Ord. We also do a bunch of refactoring to use record notation instead of constructor pattern matching, to make it easier to do similar refactoring in the future.
This reverts commit 9ff9176. Per the discussion in #6, having the `Eq` and `Ord` instances ignore the `raw` field of `Ident` causes more trouble than it's worth, as it causes the parser to incorrectly deem raw identifiers like `r#return` to be keywords. While we could fix this issue by changing the parser, this would take quite a bit of code changes to accomplish. As such, we revert the change here, and we make a note in the Haddocks for the `Eq` and `Ord` instances to beware of the fact that `raw` is taken into account. After this change, the `rustc-tests` test suite passes once more. As such, this change fixes #6.
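For context, this is the kind of Rust input affected: a raw identifier reuses a keyword's spelling as an ordinary name, so a parser must not treat `r#return` like the keyword `return`. The snippet below is an illustrative sketch, not a case from the `rustc-tests` suite.

```rust
/// A function whose name and parameter are raw identifiers spelled like
/// the keywords `return` and `match`.
pub fn r#return(r#match: i32) -> i32 {
    r#match + 1
}

/// Calls the raw-identifier function like any other function.
pub fn call_raw() -> i32 {
    r#return(41)
}
```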
Make tests pass, migrate to GitHub Actions
The previous lexer implementation in `Language.Rust.Parser.Lexer` was broken for Unicode characters with sufficiently large codepoints, as the previous implementation incorrectly attempted to port UTF-16–encoded codepoints over to `alex`, which is UTF-8–encoded. Rather than try to fix the previous implementation (which was based on old `rustc` code that is no longer used), this ports the lexer to a new implementation that is based on the Rust `unicode-xid` crate (which is how modern versions of `rustc` lex Unicode characters). Specifically:
* This adapts `unicode-xid`'s lexer generation script to generate an `alex`-based lexer instead of a Rust-based one.
* The new lexer is generated to support codepoints from Unicode 15.1.0. (It is unclear which exact Unicode version the previous lexer targeted, but given that it was last updated in 2016, it was likely quite an old version.)
* I have verified that the new lexer can lex exotic Unicode characters such as `𝑂` and `𐌝` by adding them as regression tests.

Fixes #3.
Lexer: Properly support Unicode 15.1.0
…-2.1 Restrict `happy` version to less than 2.1
`happy-2.1.1` includes a fix for haskell/happy#320, which was preventing `language-rust` from building. Now that this version of `happy` is on Hackage, we no longer need to include such a restrictive upper version bound on `happy`.
Allow building with `happy-2.1.1` or later
Upstream sync
Remove one expression that is syntactically invalid, and uncomment another expression that _is_ valid (with some minor tweaks).
Some documentation is better than no documentation.
Address leftover review comments from #12
Previously the instance was incorrect because it'd cause an infinite loop.
This version rearranges the fields of the records to ensure that the
hash field is first, which makes it possible to derive Eq and Ord.
We also do a bunch of refactoring to use record notation instead of
constructor pattern matching, to make it easier to do similar refactoring
in the future.
Other fixes: updates to make tests work, updates to make things work with more recent Aeson and Prettyprinter