arcsecond

v3.1.1
<img src="./logo.png">

Arcsecond

Arcsecond is a zero-dependency, Fantasy Land compliant JavaScript Parser Combinator library largely inspired by Haskell's Parsec.

The arcsecond-binary peer library includes parsers specifically for working with binary data.


  • A note on recursive grammars

  • Fantasy Land

  • Name

  • Release Notes

    Since version 2.0.0, the release notes track changes to arcsecond.

    Installation

    npm i arcsecond
    

    Tutorials

    The tutorials provide a practical introduction to many of the concepts in arcsecond, starting from the most basic foundations and working up to more complex topics.

    Usage

    You can use ES6 imports or CommonJS requires.

    const {char} = require('arcsecond');
    
    const parsingResult = char('a').fork(
      // The string to parse
      'abc123',
    
      // The error handler (you can also return from this function!)
      (error, parsingState) => {
        const e = new Error(error);
        e.parsingState = parsingState;
        throw e;
      },
    
      // The success handler
      (result, parsingState) => {
        console.log(`Result: ${result}`);
        return result;
      }
    );
    

    Running the examples

    git clone [email protected]:francisrstokes/arcsecond.git
    cd arcsecond
    npm i
    
    # json example
    node -r esm examples/json/json.js
    

    The examples are built as es6 modules, which means they need node to be launched with the -r esm require flag, which allows import and export statements to be used.

    API

    Methods

    .run

    .run is a method on every parser, which takes input (which may be a string, TypedArray, ArrayBuffer, or DataView), and returns the result of parsing the input using the parser.

    Example

    str('hello').run('hello')
    // -> {
    //      isError: false,
    //      result: "hello",
    //      index: 5,
    //      data: null
    //    }
    

    .fork

    The .fork method is similar to .run. It takes input (which may be a string, TypedArray, ArrayBuffer, or DataView), an error transforming function and a success transforming function, and parses the input. If parsing was successful, the result is transformed using the success transforming function and returned. If parsing was not successful, the result is transformed using the error transforming function and returned.

    Example

    str('hello').fork(
      'hello',
      (errorMsg, parsingState) => {
        console.log(errorMsg);
        console.log(parsingState);
        return "goodbye"
      },
      (result, parsingState) => {
        console.log(parsingState);
        return result;
      }
    );
    // [console.log] Object {isError: false, error: null, target: "hello", data: null, index: 5, …}
    // -> "hello"
    
    str('hello').fork(
      'farewell',
      (errorMsg, parsingState) => {
        console.log(errorMsg);
        console.log(parsingState);
        return "goodbye"
      },
      (result, parsingState) => {
        console.log(parsingState);
        return result;
      }
    );
    // [console.log] ParseError (position 0): Expecting string 'hello', got 'farew...'
    // [console.log] Object {isError: true, error: "ParseError (position 0): Expecting string 'hello',…", target: "farewell", data: null, index: 0, …}
    // "goodbye"
    

    .map

    .map takes a function and returns a parser does not consume input, but instead runs the provided function on the last matched value, and set that as the new last matched value. This method can be used to apply structure or transform the values as they are being parsed.

    Example

    const newParser = letters.map(x => ({
      matchType: 'string',
      value: x
    });
    
    newParser.run('hello world')
    // -> {
    //      isError: false,
    //      result: {
    //        matchType: "string",
    //        value: "hello"
    //      },
    //      index: 5,
    //      data: null
    //    }
    

    .chain

    .chain takes a function which recieves the last matched value and should return a parser. That parser is then used to parse the following input, forming a chain of parsers based on previous input. .chain is the fundamental way of creating contextual parsers.

    Example

    
    const lettersThenSpace = sequenceOf([
      letters,
      char(' ')
    ]).map(x => x[0]);
    
    const newParser = lettersThenSpace.chain(matchedValue => {
      switch (matchedValue) {
        case 'number': return digits;
    
        case 'string': return letters;
    
        case 'bracketed': return sequenceOf([
          char('('),
          letters,
          char(')')
        ]).map(values => values[1]);
    
        default: return fail('Unrecognised input type');
      }
    });
    
    newParser.run('string Hello')
    // -> {
    //      isError: false,
    //      result: "Hello",
    //      index: 12,
    //      data: null
    //    }
    
    newParser.run('number 42')
    // -> {
    //      isError: false,
    //      result: "42",
    //      index: 9,
    //      data: null
    //    }
    
    newParser.run('bracketed (arcsecond)')
    // -> {
    //      isError: false,
    //      result: "arcsecond",
    //      index: 21,
    //      data: null
    //    }
    
    newParser.run('nope nothing')
    // -> {
    //      isError: true,
    //      error: "Unrecognised input type",
    //      index: 5,
    //      data: null
    //    }
    

    .mapFromData

    .mapFromData is almost the same as .map, except the function which it is passed also has access to the internal state data, and can thus transform values based on this data.

    Example

    
    const parserWithData = withData(letters.mapFromData(({result, data}) => ({
      matchedValueWas: result,
      internalDataWas: data
    })));
    
    parserWithData(42).run('hello');
    // -> {
    //      isError: false,
    //      result: {
    //        matchedValueWas: "hello",
    //        internalDataWas: 42
    //      },
    //      index: 5,
    //      data: 42
    //    }
    

    .chainFromData

    .chainFromData is almost the same as .chain, except the function which it is passed also has access to the internal state data, and can choose how parsing continues based on this data.

    Example

    const lettersThenSpace = sequenceOf([
      letters,
      char(' ')
    ]).map(x => x[0]);
    
    const parser = withData(lettersThenSpace.chainFromData(({result, data}) => {
      if (data.bypassNormalApproach) {
        return digits;
      }
    
      return letters;
    }));
    
    parser({ bypassNormalApproach: false }).run('hello world');
    // -> {
    //      isError: false,
    //      result: "world",
    //      index: 11,
    //      data: { bypassNormalApproach: false }
    //    }
    
    parser({ bypassNormalApproach: true }).run('hello world');
    // -> {
    //      isError: true,
    //      error: "ParseError (position 6): Expecting digits",
    //      index: 6,
    //      data: { bypassNormalApproach: true }
    //    }
    

    .errorMap

    .errorMap is like .map but it transforms the error value. The function passed to .errorMap gets an object the current error message (error) , the index (index) that parsing stopped at, and the data (data) from this parsing session.

    Example

    const newParser = letters.errorMap(({error, index}) => `Old message was: [${error}] @ index ${index}`);
    
    newParser.run('1234')
    // -> {
    //      isError: true,
    //      error: "Old message was: [ParseError (position 0): Expecting letters] @ index 0",
    //      index: 0,
    //      data: null
    //    }
    

    .errorChain

    .errorChain is almost the same as .chain, except that it only runs if there is an error in the parsing state. This is a useful method when either trying to recover from errors, or for when a more specific error message should be constructed.

    Example

    
    const parser = digits.errorChain(({error, index, data}) => {
      console.log('Recovering...');
      return letters;
    });
    
    p.run('42');
    // -> {
    //      isError: false,
    //      result: "42",
    //      index: 2,
    //      data: null
    //    }
    
    p.run('hello');
    // [console.log] Recovering...
    // -> {
    //      isError: false,
    //      result: "hello",
    //      index: 5,
    //      data: null
    //    }
    
    s = parser.run('');
    // [console.log] Recovering...
    // -> {
    //     isError: true,
    //     error: "ParseError (position 0): Expecting letters",
    //     index: 0,
    //     data: null
    //   }
    

    Functions

    setData

    setData takes anything that should be set as the internal state data, and returns a parser that will perform that side effect when the parser is run. This does not consume any input. If parsing is currently in an errored state, then the data will not be set.

    Example

    const parser = coroutine(run=> {
      const name = run(letters);
    
      if (name === 'Jim') {
        run(setData('The name is Jim'));
      }
    
      return name;
    });
    
    parser.run('Jim');
    // -> {
    //      isError: false,
    //      result: "Jim",
    //      index: 3,
    //      data: "The name is Jim"
    //    }
    

    If dealing with any complex level of state - such as an object where individual keys will be updated or required, then it can be useful to create utility parsers to assist with updating the internal state data. One possible pattern that could be used is the reducer pattern, famed by redux:

    Example

    
    const createStateReducer = reducer => action => getData.chain(state => setData(reducer(state, action)));
    
    const updateCounterState = createStateReducer((state = 0, action) => {
      switch (action.type) {
        case 'INC': {
          return state + 1;
        }
        case 'DEC': {
          return state - 1;
        }
        case 'ADD': {
          return state + action.payload;
        }
        case 'RESET': {
          return 0;
        }
      }
    });
    
    const parser = coroutine(run=>{
      let count = run(updateCounterState({ type: 'RESET' }));
      console.log(count);
    
      run(updateCounterState({ type: 'INC' }));
      run(updateCounterState({ type: 'INC' }));
      run(updateCounterState({ type: 'DEC' }));
      count = run(updateCounterState({ type: 'INC' }));
      console.log(count);
    
      return run(updateCounterState({ type: 'ADD', payload: 10 }));
    });
    
    parser.run('Parser is not looking at the text!');
    // [console.log] 0
    // [console.log] 2
    // -> {
    //      isError: false,
    //      result: 12,
    //      index: 0,
    //      data: 12
    //    }
    

    withData

    withData takes a provided parser, and returns a function waiting for some state data to be set, and then returns a new parser. That parser, when run, ensures that the state data is set as the internal state data before the provided parser runs.

    Example

    const parserWithoutData = letters;
    const parser = withData(parserWithoutData);
    
    parser("hello world!").run('Jim');
    // -> {
    //      isError: false,
    //      result: "Jim",
    //      index: 3,
    //      data: "hello world!"
    //    }
    
    parserWithoutData.run('Jim');
    // -> {
    //      isError: false,
    //      result: "Jim",
    //      index: 3,
    //      data: null
    //    }
    

    mapData

    mapData takes a function that recieves and returns some state data, and transforms the internal state data using the function, without consuming any input.

    Example

    const parser = withData(mapData(s => s.toUpperCase()));
    
    parser("hello world!").run('Jim');
    // -> {
    //      isError: false,
    //      result: null,
    //      index: 0,
    //      data: "HELLO WORLD!"
    //    }
    

    getData

    getData is a parser that will always return what is contained in the internal state data, without consuming any input.

    Example

    const parser = withData(sequenceOf([
      letters,
      digits,
      getData
    ]));
    
    parser("hello world!").run('Jim1234');
    // -> {
    //      isError: false,
    //      result: ["Jim", "1234", "hello world!"],
    //      index: 3,
    //      data: "hello world!"
    //    }
    

    If dealing with any complex level of state - such as an object where individual keys will be updated or required, then it can be useful to create utility parsers to assist.

    Example

    
    const selectState = selectorFn => getData.map(selectorFn);
    
    const parser = withData(coroutine(run=> {
      // Here we can take or transform the state
      const occupation = run(selectState(({job}) => job));
      const initials = run(selectState(({firstName, lastName}) => `${firstName[0]}${lastName[0]}`));
    
      console.log(`${initials}: ${occupation}`);
    
      const first = run(letters);
      const second = run(digits);
    
      return `${second}${first}`;
    }));
    
    parser({
      firstName: "Francis",
      lastName: "Stokes",
      job: "Developer"
    }).run('Jim1234');
    // [console.log] FS: Developer
    // -> {
    //      isError: false,
    //      result: "1234Jim",
    //      index: 3,
    //      data: {
    //        firstName: "Francis",
    //        lastName: "Stokes",
    //        job: "Developer"
    //      }
    //    }
    

    coroutine

    coroutine takes a user provided parser function, to which is passed a run function. Within the parser function, the user can run other parsers, and get immediate access to their results.

    coroutine allows you to write parsers in a more imperative and sequential way - in much the same way async/await allows you to write code with promises in a more sequential way.

    Inside of the parser function, you can use all regular JavaScript language features, like loops, variable assignments, try/catch, and conditional statements. This makes it easy to write very powerful parsers using coroutine, but on the other side it can lead to less readable, more complex code.

    Debugging is also much easier, as breakpoints can be easily added, and values logged to the console after they have been parsed.

    Example

    const parser = coroutine(run => {
      // Capture some letters and assign them to a variable
      const name = run(letters);
    
      // Capture a space
      run(char(' '));
    
      const age = run(digits.map(Number));
    
      // Capture a space
      run(char(' '));
    
      if (age >= 18) {
        run(str('is an adult'));
      } else {
        run(str('is a child'));
      }
    
      return { name, age };
    });
    
    parser.run('Jim 19 is an adult');
    // -> {
    //      isError: false,
    //      result: { name: "Jim", age: 19 },
    //      index: 18,
    //      data: null
    //    }
    
    parser.run('Jim 17 is an adult');
    // -> {
    //      isError: true,
    //      error: "ParseError (position 7): Expecting string 'is a child', got 'is an adul...'",
    //      index: 7,
    //      data: null
    //    }
    

    char

    char takes a character and returns a parser that matches that character exactly one time.

    Example

    char ('h').run('hello')
    // -> {
    //      isError: false,
    //      result: "h",
    //      index: 1,
    //      data: null
    //    }
    

    anyChar

    anyChar matches exactly one utf-8 character.

    Example

    anyChar.run('a')
    // -> {
    //      isError: false,
    //      result: "a",
    //      index: 1,
    //      data: null
    //    }
    
    anyChar.run('😉')
    // -> {
    //      isError: false,
    //      result: "😉",
    //      index: 4,
    //      data: null
    //    }
    

    str

    str takes a string and returns a parser that matches that string exactly one time.

    Example

    str('hello').run('hello world')
    // -> {
    //      isError: false,
    //      result: "hello",
    //      index: 5,
    //      data: null
    //    }
    

    digit

    digit is a parser that matches exactly one numerical digit /[0-9]/.

    Example

    digit.run('99 bottles of beer on the wall')
    // -> {
    //      isError: false,
    //      result: "9",
    //      index: 1,
    //      data: null
    //    }
    

    digits

    digits is a parser that matches one or more numerical digit /[0-9]/.

    Example

    digits.run('99 bottles of beer on the wall')
    // -> {
    //      isError: false,
    //      result: "99",
    //      index: 2,
    //      data: null
    //    }
    

    letter

    letter is a parser that matches exactly one alphabetical letter /[a-zA-Z]/.

    Example

    letter.run('hello world')
    // -> {
    //      isError: false,
    //      result: "h",
    //      index: 1,
    //      data: null
    //    }
    

    letters

    letters is a parser that matches one or more alphabetical letter /[a-zA-Z]/.

    Example

    letters.run('hello world')
    // -> {
    //      isError: false,
    //      result: "hello",
    //      index: 5,
    //      data: null
    //    }
    

    whitespace

    whitespace is a parser that matches one or more whitespace characters.

    Example

    const newParser = sequenceOf ([
      str ('hello'),
      whitespace,
      str ('world')
    ]);
    
    newParser.run('hello           world')
    // -> {
    //      isError: false,
    //      result: [ "hello", "           ", "world" ],
    //      index: 21,
    //      data: null
    //    }
    
    newParser.run('helloworld')
    // -> {
    //      isError: true,
    //      error: "ParseError 'many1' (position 5): Expecting to match at least one value",
    //      index: 5,
    //      data: null
    //    }
    

    optionalWhitespace

    optionalWhitespace is a parser that matches zero or more whitespace characters.

    Example

    const newParser = sequenceOf ([
      str ('hello'),
      optionalWhitespace,
      str ('world')
    ]);
    
    newParser.run('hello           world')
    // -> {
    //      isError: false,
    //      result: [ "hello", "           ", "world" ],
    //      index: 21,
    //      data: null
    //    }
    
    newParser.run('helloworld')
    // -> {
    //      isError: false,
    //      result: [ "hello", "", "world" ],
    //      index: 10,
    //      data: null
    //    }
    

    peek

    peek matches exactly one numerical byte without consuming any input.

    Example

    peek.run('hello world')
    // -> {
    //      isError: false,
    //      result: 104,
    //      index: 0,
    //      data: null
    //    }
    
    sequenceOf([
      str('hello'),
      peek
    ]).run('hello world')
    // -> {
    //      isError: false,
    //      result: [ "hello", 32 ],
    //      index: 5,
    //      data: null
    //    }
    

    anyOfString

    anyOfString takes a string and returns a parser that matches exactly one character from that string.

    Example

    anyOfString('aeiou').run('unusual string')
    // -> {
    //      isError: false,
    //      result: "u",
    //      index: 1,
    //      data: null
    //    }
    

    regex

    regex takes a RegExp and returns a parser that matches as many characters as the RegExp matches.

    Example

    regex(/^[hH][aeiou].{2}o/).run('hello world')
    // -> {
    //      isError: false,
    //      result: "hello",
    //      index: 5,
    //      data: null
    //    }
    

    sequenceOf

    sequenceOf takes an array of parsers, and returns a new parser that matches each of them sequentially, collecting up the results into an array.

    Example

    const newParser = sequenceOf ([
      str ('he'),
      letters,
      char (' '),
      str ('world'),
    ])
    
    newParser.run('hello world')
    // -> {
    //      isError: false,
    //      result: [ "he", "llo", " ", "world" ],
    //      index: 11,
    //      data: null
    //    }
    

    namedSequenceOf

    namedSequenceOf takes an array of string/parser pairs, and returns a new parser that matches each of them sequentially, collecting up the results into an object where the key is the string in the pair.

    A pair is just an array in the form: [string, parser]

    Example

    const newParser = namedSequenceOf ([
      ['firstPart', str ('he')],
      ['secondPart', letters],
      ['thirdPart', char (' ')],
      ['forthPart', str ('world')],
    ])
    
    newParser.run('hello world')
    // -> {
    //      isError: false,
    //      result: {
    //        firstPart: "he",
    //        secondPart: "llo",
    //        thirdPart: " ",
    //        forthPart: "world"
    //      },
    //      index: 11,
    //      data: null
    //    }
    

    choice

    choice takes an array of parsers, and returns a new parser that tries to match each one of them sequentially, and returns the first match. If choice fails, then it returns the error message of the parser that matched the most from the string.

    Example

    const newParser = choice ([
      digit,
      char ('!'),
      str ('hello'),
      str ('pineapple')
    ])
    
    newParser.run('hello world')
    // -> {
    //      isError: false,
    //      result: "hello",
    //      index: 5,
    //      data: null
    //    }
    

    lookAhead

    lookAhead takes look ahead parser, and returns a new parser that matches using the look ahead parser, but without consuming input.

    Example

    const newParser = sequenceOf ([
      str ('hello '),
      lookAhead (str ('world')),
      str ('wor')
    ]);
    
    newParser.run('hello world')
    // -> {
    //      isError: false,
    //      result: [ "hello ", "world", "wor" ],
    //      index: 9,
    //      data: null
    //    }
    

    sepBy

    sepBy takes two parsers - a separator parser and a value parser - and returns a new parser that matches zero or more values from the value parser that are separated by values of the separator parser. Because it will match zero or more values, this parser will fail if a value is followed by a separator but NOT another value. If there's no value, the result will be an empty array, not failure.

    Example

    const newParser = sepBy (char (',')) (letters)
    
    newParser.run('some,comma,separated,words')
    // -> {
    //      isError: false,
    //      result: [ "some", "comma", "separated", "words" ],
    //      index: 26,
    //      data: null
    //    }
    
    newParser.run('')
    // -> {
    //      isError: false,
    //      result: [],
    //      index: 0,
    //      data: null
    //    }
    
    newParser.run('12345')
    // -> {
    //      isError: false,
    //      result: [],
    //      index: 0,
    //      data: null
    //    }
    

    sepBy1

    sepBy1 is the same as sepBy, except that it matches one or more occurence.

    Example

    const newParser = sepBy1 (char (',')) (letters)
    
    newParser.run('some,comma,separated,words')
    // -> {
    //      isError: false,
    //      result: [ "some", "comma", "separated", "words" ],
    //      index: 26,
    //      data: null
    //    }
    
    newParser.run('1,2,3')
    // -> {
    //      isError: true,
    //      error: "ParseError 'sepBy1' (position 0): Expecting to match at least one separated value",
    //      index: 0,
    //      data: null
    //    }
    

    exactly

    exactly takes a positive number and returns a function. That function takes a parser and returns a new parser which matches the given parser the specified number of times.

    Example

    const newParser = exactly (4)(letter)
    
    newParser.run('abcdef')
    // -> {
    //      isError: false,
    //      result: [ "a", "b", "c", "d" ],
    //      index: 4,
    //      data: null
    //    }
    
    newParser.run('abc')
    // -> {
    //      isError: true,
    //      error: 'ParseError (position 0): Expecting 4 letter, but got end of input.',
    //      index: 0,
    //      data: null
    //    }
    
    newParser.run('12345')
    // -> {
    //      isError: true,
    //      error: 'ParseError (position 0): Expecting 4 letter, got '1'',
    //      index: 0,
    //      data: null
    //    }
    

    many

    many takes a parser and returns a new parser which matches that parser zero or more times. Because it will match zero or more values, this parser will always match, resulting in an empty array in the zero case.

    Example

    const newParser = many (str ('abc'))
    
    newParser.run('abcabcabcabc')
    // -> {
    //      isError: false,
    //      result: [ "abc", "abc", "abc", "abc" ],
    //      index: 12,
    //      data: null
    //    }
    
    newParser.run('')
    // -> {
    //      isError: false,
    //      result: [],
    //      index: 0,
    //      data: null
    //    }
    
    newParser.run('12345')
    // -> {
    //      isError: false,
    //      result: [],
    //      index: 0,
    //      data: null
    //    }
    

    many1

    many1 is the same as many, except that it matches one or more occurence.

    Example

    const newParser = many1 (str ('abc'))
    
    newParser.run('abcabcabcabc')
    // -> {
    //      isError: false,
    //      result: [ "abc", "abc", "abc", "abc" ],
    //      index: 12,
    //      data: null
    //    }
    
    newParser.run('')
    // -> {
    //   isError: true,
    //   error: "ParseError 'many1' (position 0): Expecting to match at least one value",
    //   index: 0,
    //   data: null
    // }
    
    newParser.run('12345')
    // -> {
    //   isError: true,
    //   error: "ParseError 'many1' (position 0): Expecting to match at least one value",
    //   index: 0,
    //   data: null
    // }
    

    between

    between takes 3 parsers, a left parser, a right parser, and a value parser, returning a new parser that matches a value matched by the value parser, between values matched by the left parser and the right parser.

    This parser can easily be partially applied with char ('(') and char (')') to create a betweenRoundBrackets parser, for example.

    Example

    const newParser = between (char ('<')) (char ('>')) (letters);
    
    newParser.run('<hello>')
    // -> {
    //      isError: false,
    //      result: "hello",
    //      index: 7,
    //      data: null
    //    }
    
    const betweenRoundBrackets = between (char ('(')) (char (')'));
    
    betweenRoundBrackets (many (letters)).run('(hello world)')
    // -> {
    //      isError: true,
    //      error: "ParseError (position 6): Expecting character ')', got ' '",
    //      index: 6,
    //      data: null
    //    }
    

    everythingUntil

    Note: Between 2.x and 3.x, the definition of the everythingUntil has changed. In 3.x, what was previously everythingUntil is now everyCharUntil.

    everythingUntil takes a termination parser and returns a new parser which matches every possible numerical byte up until a value is matched by the termination parser. When a value is matched by the termination parser, it is not "consumed".

    Example

    everythingUntil (char ('.')).run('This is a sentence.This is another sentence')
    // -> {
    //      isError: false,
    //      result: [84, 104, 105, 115, 32, 105, 115, 32, 97, 32, 115, 101, 110, 116, 101, 110, 99, 101],
    //      index: 18,
    //      data: null
    //    }
    
    // termination parser doesn't consume the termination value
    const newParser = sequenceOf ([
      everythingUntil (char ('.')),
      str ('This is another sentence')
    ]);
    
    newParser.run('This is a sentence.This is another sentence')
    // -> {
    //      isError: true,
    //      error: "ParseError (position 18): Expecting string 'This is another sentence', got '.This is another sentenc...'",
    //      index: 18,
    //      data: null
    //    }
    

    everyCharUntil

    everyCharUntil takes a termination parser and returns a new parser which matches every possible character up until a value is matched by the termination parser. When a value is matched by the termination parser, it is not "consumed".

    Example

    everyCharUntil (char ('.')).run('This is a sentence.This is another sentence')
    // -> {
    //      isError: false,
    //      result: 'This is a sentence',
    //      index: 18,
    //      data: null
    //    }
    
    // termination parser doesn't consume the termination value
    const newParser = sequenceOf ([
      everyCharUntil (char ('.')),
      str ('This is another sentence')
    ]);
    
    newParser.run('This is a sentence.This is another sentence')
    // -> {
    //      isError: true,
    //      error: "ParseError (position 18): Expecting string 'This is another sentence', got '.This is another sentenc...'",
    //      index: 18,
    //      data: null
    //    }
    

    anythingExcept

    Note: Between 2.x and 3.x, the definition of the anythingExcept has changed. In 3.x, what was previously anythingExcept is now anyCharExcept.

    anythingExcept takes a exception parser and returns a new parser which matches exactly one numerical byte, if it is not matched by the exception parser.

    Example

    anythingExcept (char ('.')).run('This is a sentence.')
    // -> {
    //   isError: false,
    //   result: 84,
    //   index: 1,
    //   data: null
    // }
    
    const manyExceptDot = many (anythingExcept (char ('.')))
    manyExceptDot.run('This is a sentence.')
    // -> {
    //      isError: false,
    //      result: [84, 104, 105, 115, 32, 105, 115, 32, 97, 32, 115, 101, 110, 116, 101, 110, 99, 101, 46],
    //      index: 18,
    //      data: null
    //    }
    

    anyCharExcept

    anyCharExcept takes a exception parser and returns a new parser which matches exactly one character, if it is not matched by the exception parser.

    Example

    anyCharExcept (char ('.')).run('This is a sentence.')
    // -> {
    //   isError: false,
    //   result: 'T',
    //   index: 1,
    //   data: null
    // }
    
    const manyExceptDot = many (anyCharExcept (char ('.')))
    manyExceptDot.run('This is a sentence.')
    // -> {
    //      isError: false,
    //      result: ['T', 'h', 'i', 's', ' ', 'i', 's', ' ', 'a', ' ', 's', 'e', 'n', 't', 'e', 'n', 'c', 'e'],
    //      index: 18,
    //      data: null
    //    }
    

    possibly

    possibly takes an attempt parser and returns a new parser which tries to match using the attempt parser. If it is unsuccessful, it returns a null value and does not "consume" any input.

    Example

    const newParser = sequenceOf ([
      possibly (str ('Not Here')),
      str ('Yep I am here')
    ]);
    
    newParser.run('Yep I am here')
    // -> {
    //      isError: false,
    //      result: [ null, "Yep I am here" ],
    //      index: 13,
    //      data: null
    //    }
    

    startOfInput

    startOfInput is a parser that only succeeds when the parser is at the beginning of the input.

    Example

    const mustBeginWithHeading = sequenceOf([
        startOfInput,
        str("# ")
      ]);
    const newParser = between(mustBeginWithHeading)(endOfInput)(everyCharUntil(endOfInput));
    
    newParser.run('# Heading');
    // -> {
    //      isError: false,
    //      result: "# Heading",
    //      index: 9,
    //      data: null
    //    }
    
    newParser.run(' # Heading');
    // -> {
    //      isError: true,
    //      error: "ParseError (position 0): Expecting string '# ', got ' #...'",
    //      index: 0,
    //      data: null
    //    }
    

    endOfInput

    endOfInput is a parser that only succeeds when there is no more input to be parsed.

    Example

    const newParser = sequenceOf ([
      str ('abc'),
      endOfInput
    ]);
    
    newParser.run('abc')
    // -> {
    //      isError: false,
    //      result: [ "abc", null ],
    //      index: 3,
    //      data: null
    //    }
    
    newParser.run('')
    // -> {
    //      isError: true,
    //      error: "ParseError (position 0): Expecting string 'abc', but got end of input.",
    //      index: 0,
    //      data: null
    //    }
    

    skip

    skip takes a skip parser and returns a new parser which matches using the skip parser, but doesn't return its value, but instead the value of whatever came before it.

    Example

    const newParser = pipeParsers ([
      str ('abc'),
      str('123'),
      skip (str ('def'))
    ])
    
    newParser.run('abc123def')
    // -> {
    //      isError: false,
    //      result: "123",
    //      index: 9,
    //      data: null
    //    }
    

    pipeParsers

    pipeParsers takes an array of parsers and composes them left to right, so each parsers return value is passed into the next one in the chain. The result is a new parser that, when run, yields the result of the final parser in the chain.

    Example

    const newParser = pipeParsers ([
      str ('hello'),
      char (' '),
      str ('world')
    ]);
    
    newParser.run('hello world')
    // -> {
    //      isError: false,
    //      result: "world",
    //      index: 11,
    //      data: null
    //    }
    

    composeParsers

    composeParsers takes an array of parsers and composes them right to left, so each parsers return value is passed into the next one in the chain. The result is a new parser that, when run, yields the result of the final parser in the chain.

    Example

    const newParser = composeParsers ([
      str ('world'),
      char (' '),
      str ('hello')
    ]);
    
    newParser.run('hello world')
    // -> {
    //      isError: false,
    //      result: "world",
    //      index: 11,
    //      data: null
    //    }
    

    takeRight

    takeRight takes two parsers, left and right, and returns a new parser that first matches the left, then the right, and keeps the value matched by the right.

    Example

    const newParser = takeRight (str ('hello ')) (str ('world'))
    
    newParser.run('hello world')
    // -> {
    //      isError: false,
    //      result: "world",
    //      index: 11,
    //      data: null
    //    }
    

    takeLeft

    takeLeft takes two parsers, left and right, and returns a new parser that first matches the left, then the right, and keeps the value matched by the left.

    Example

    const newParser = takeLeft (str ('hello ')) (str ('world'))
    
    newParser.run('hello world')
    // -> {
    //      isError: false,
    //      result: "hello",
    //      index: 11,
    //      data: null
    //    }
    

    recursiveParser

    recursiveParser takes a function that returns a parser (a thunk), and returns that same parser. This is needed in order to create recursive parsers because JavaScript is not a "lazy" language.

    In the following example both the value parser and the matchArray parser are defined in terms of each other, so one must be one must be defined using recursiveParser.

    Example

    const value = recursiveParser (() => choice ([
      matchNum,
      matchStr,
      matchArray
    ]));
    
    const betweenSquareBrackets = between (char ('[')) (char (']'));
    const commaSeparated = sepBy (char (','));
    const spaceSeparated = sepBy (char (' '));
    
    const matchNum = digits;
    const matchStr = letters;
    const matchArray = betweenSquareBrackets (commaSeparated (value));
    
    spaceSeparated(value).run('abc 123 [42,somethingelse] 45')
    // -> {
    //      isError: false,
    //      result: [ "abc", "123", [ "42", "somethingelse" ], "45" ],
    //      index: 29,
    //      data: null
    //    }
    

    tapParser

    tapParser takes a function and returns a parser that does nothing and consumes no input, but runs the provided function on the last parsed value. This is intended as a debugging tool to see the state of parsing at any point in a sequential operation like sequenceOf or pipeParsers.

    Example

    const newParser = sequenceOf ([
      letters,
      tapParser(console.log),
      char (' '),
      letters
    ]);
    
    newParser.run('hello world')
    // -> [console.log]: Object {isError: false, error: null, target: "hello world", data: null, index: 5, …}
    // -> {
    //      isError: false,
    //      result: [ "hello", "hello", " ", "world" ],
    //      index: 11,
    //      data: null
    //    }
    

    decide

    decide takes a function that recieves the last matched value and returns a new parser. It's important that the function always returns a parser. If a valid one cannot be selected, you can always use fail.

    decide allows an author to create a context-sensitive grammar.

    Example

    const newParser = sequenceOf ([
      takeLeft (letters) (char (' ')),
      decide (v => {
        switch (v) {
          case 'asLetters': return letters;
          case 'asDigits': return digits;
          default: return fail(`Unrecognised signifier '${v}'`);
        }
      })
    ]);
    
    newParser.run('asDigits 1234')
    // -> {
    //      isError: false,
    //      result: [ "asDigits", "1234" ],
    //      index: 13,
    //      data: null
    //    }
    
    newParser.run('asLetters hello')
    // -> {
    //      isError: false,
    //      result: [ "asLetters", "hello" ],
    //      index: 15,
    //      data: null
    //    }
    
    newParser.run('asPineapple wayoh')
    // -> {
    //      isError: true,
    //      error: "Unrecognised signifier 'asPineapple'",
    //      index: 12,
    //      data: null
    //    }
    

    mapTo

    mapTo takes a function and returns a parser does not consume input, but instead runs the provided function on the last matched value, and set that as the new last matched value. This function can be used to apply structure or transform the values as they are being parsed.

    Example

    const newParser = pipeParsers([
      letters,
      mapTo(x => {
        return {
          matchType: 'string',
          value: x
        }
      })
    ]);
    
    newParser.run('hello world')
    // -> {
    //      isError: false,
    //      result: {
    //        matchType: "string",
    //        value: "hello"
    //      },
    //      index: 5,
    //      data: null
    //    }
    

    errorMapTo

    errorMapTo is like mapTo but it transforms the error value. The function passed to errorMapTo gets the current error message as its first argument and the index that parsing stopped at as the second.

    Example

    const newParser = pipeParsers([
      letters,
      errorMapTo((message, index) => `Old message was: [${message}] @ index ${index}`)
    ]);
    
    newParser.run('1234')
    // -> {
    //      isError: true,
    //      error: "Old message was: [ParseError (position 0): Expecting letters] @ index 0",
    //      index: 0,
    //      data: null
    //    }
    

    fail

    fail takes an error message string and returns a parser that always fails with the provided error message.

    Example

    fail('Nope').run('hello world')
    // -> {
    //      isError: true,
    //      error: "Nope",
    //      index: 0,
    //      data: null
    //    }
    

    succeedWith

    succeedWith takes an value and returns a parser that always matches that value and does not consume any input.

    Example

    succeedWith ('anything').run('hello world')
    // -> {
    //      isError: false,
    //      result: "anything",
    //      data: null
    //      index: 0,
    //    }
    

    either

    either takes a parser and returns a parser that will always succeed, but the captured value will be an Either, indicating success or failure.

    Example

    either(fail('nope!')).run('hello world')
    // -> {
    //      isError: false,
    //      result: {
    //        isError: true,
    //        value: "nope!"
    //      },
    //      index: 0,
    //      data: null
    //    }
    

    toPromise

    toPromise converts a ParserResult (what is returned from .run) into a Promise.

    Example

    const parser = str('hello');
    
    toPromise(parser.run('hello world'))
      .then(console.log)
      .catch(({error, index, data}) => {
        console.log(error);
        console.log(index);
        console.log(data);
      });
    // -> [console.log] hello
    
    toPromise(parser.run('goodbye world'))
      .then(console.log)
      .catch(({error, index, data}) => {
        console.log('Error!');
        console.log(error);
        console.log(index);
        console.log(data);
      });
    // -> [console.log] Error!
    // -> [console.log] ParseError (position 0): Expecting string 'hello', got 'goodb...'
    // -> [console.log] 0
    // -> [console.log] null
    

    toValue

    toValue converts a ParserResult (what is returned from .run) into a regular value, and throws an error if the result contained one.

    Example

    const result = str ('hello').run('hello worbackgroiund<hAld');
    
    try {
      const value = toValue(result);
      console.log(value);
      // -> 'hello'
    } catch (parseError) {
      console.error(parseError.message)
    }
    

    parse

    parse takes a parser and input (which may be a string, TypedArray, ArrayBuffer, or DataView), and returns the result of parsing the input using the parser.

    Example

    parse (str ('hello')) ('hello')
    // -> {
    //      isError: false,
    //      result: "hello",
    //      index: 5,
    //      data: null
    //    }
    

    A note on recursive grammars

    If you're parsing a programming language, a configuration, or anything of sufficient complexity, it's likely that you'll need to define some parsers in terms of each other. You might want to do something like:

    const value = choice ([
      matchNum,
      matchStr,
      matchArray
    ]);
    
    const betweenSquareBrackets = between (char ('[')) (char (']'));
    const commaSeparated = sepBy (char (','));
    
    const matchNum = digits;
    const matchStr = letters;
    const matchArray = betweenSquareBrackets (commaSeparated (value));
    

    In this example, we are trying to define value in terms of matchArray, and matchArray in terms of value. This is problematic in a language like JavaScript because it is what's known as an "eager language". Because the definition of value is a function call to choice, the arguments of choice must be fully evaluated, and of course none of them are yet. If we just move the definition below matchNum, matchStr, and matchArray, we'll have the same problem with value not being defined before matchArray wants to use it.

    We can get around JavaScript's eagerness by using recursiveParser, which takes a function that returns a parser:

    const value = recursiveParser(() => choice ([
      matchNum,
      matchStr,
      matchArray
    ]));
    
    const betweenSquareBrackets = between (char ('[')) (char (']'));
    const commaSeparated = sepBy (char (','));
    
    const matchNum = digits;
    const matchStr = letters;
    const matchArray = betweenSquareBrackets (commaSeparated (value));
    

    Fantasy Land

    This library implements the following Fantasy Land (v3) interfaces:

    Every parser, or parser made from composing parsers has a .of, .map, .chain, and .ap method.

    Equivalent Operations

    of

    Parser.of(42)
    
    // is equivalent to
    
    succeedWith (42)
    

    map

    letters.map (fn)
    
    // is equivalent to
    
    pipeParsers ([ letters, mapTo (fn) ])
    

    chain

    letters.chain (x => someOtherParser)
    
    // is equivalent to
    
    pipeParsers ([ letters, decide (x => someOtherParser) ])
    

    ap

    letters.ap (Parser.of (fn))
    
    // is equivalent to
    
    pipeParsers ([
      sequenceOf ([ succeedWith (fn), letters ]),
      mapTo (([fn, x]) => fn(x))
    ]);
    

    Name

    The name is also derived from parsec, which in astronomical terms is an "astronomical unit [that] subtends an angle of one arcsecond".

    Metadata

    • MIT
    • Whatever
    • Francis Stokes
    • released 8/20/2020

    Downloads

    Maintainers