Lesser evils
Aug. 11th, 2004, 10:06 am

It seems to me that there should be a way to take advantage of the current Object-Oriented programming ideology to make a more efficient, or perhaps simply more usable, parser and lexical analyzer kit.
( Ranting about lack of correctly designed kits... oh wait, there's one )
Ah. Wait! Someone did, it looks like - just found a program called JavaCC, which combines a lexer and a parser in one tool.
Compilers are one of my favorite topics, largely because, unlike most topics that interest me, they lurk _just_ out of my reach instead of miles over my head. The math involved is relatively sensible, unlike, say, 3D textures, which require Digital Signal Processing knowledge, which in turn requires a lot of calculus. I hate calculus.
Compilation is basically a two-part problem: lexical analysis and parsing. In lexing, you take a stream of characters and divide it into 'tokens'. In English, the tokens are words and punctuation. So output from a lexer for English (if one existed, which it doesn't) would be something like:
PRONOUN("I") VERB("am") INDEFINITE_ARTICLE("a") NOUN("fish") COMMA(",") VERB("feed") PRONOUN("me") PERIOD(".")
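JavaCC would let you declare these tokens with regular expressions; as a rough sketch of what a lexer does under the hood, here's a hand-rolled one in plain Java. The word lists and token names are my own invention for this one sentence, not anything JavaCC actually provides:

```java
import java.util.*;
import java.util.regex.*;

public class EnglishLexer {
    // A token is just a type name plus the matched text.
    record Token(String type, String text) {
        public String toString() { return type + "(\"" + text + "\")"; }
    }

    // Tiny, made-up word lists; a real English lexer would need a dictionary.
    static final Set<String> PRONOUNS = Set.of("I", "me", "he", "she");
    static final Set<String> VERBS = Set.of("am", "is", "feed");
    static final Set<String> ARTICLES = Set.of("a", "an");

    static List<Token> lex(String input) {
        List<Token> tokens = new ArrayList<>();
        // Split the character stream into words and punctuation.
        Matcher m = Pattern.compile("[A-Za-z]+|[,.]").matcher(input);
        while (m.find()) {
            String s = m.group();
            String type;
            if (s.equals(","))                 type = "COMMA";
            else if (s.equals("."))            type = "PERIOD";
            else if (PRONOUNS.contains(s))     type = "PRONOUN";
            else if (VERBS.contains(s))        type = "VERB";
            else if (ARTICLES.contains(s))     type = "INDEFINITE_ARTICLE";
            else                               type = "NOUN"; // anything else: call it a noun
            tokens.add(new Token(type, s));
        }
        return tokens;
    }

    public static void main(String[] args) {
        System.out.println(lex("I am a fish, feed me."));
    }
}
```

Classifying words by lookup table is cheating, of course - that's exactly why nobody has a real English lexer - but for a computer language, where keywords and operators are a fixed set, this is essentially the whole job.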
Once a string has been lexed, the tokens are fed into a parser, which makes _sense_ of them. A parser is driven by a set of rules called a grammar. Just as in English, a grammar delineates the legal ways to put tokens together.
So, something like this would be the beginnings of an English grammar - I'll be nice and put it into English phrases rather than the standard notation grammars are specified in (it's called Backus-Naur Form, or BNF, and it's actually pretty readable... for computer languages. AFAIK, you can't quite get English into BNF. Too fucked up a grammar structure):
A SENTENCE is one or more PHRASEs separated by COMMAs, ending with a PERIOD.
A PHRASE is a SUBJECT followed by a VERB followed by an OBJECT (or, for a command, just a VERB and an OBJECT).
A SUBJECT is a PRONOUN or a NOUN.
An OBJECT is a PRONOUN, a NOUN, or an INDEFINITE_ARTICLE followed by a NOUN.
And so on.
Once the parser has gathered a certain collection of stuff - say, a PHRASE in the above example - it takes some kind of action. In computer compiler situations, this action is often translating the phrase into machine code. I'd guess that if a similar process were going on in your head, the result would be some kind of comprehension - "Ah, he's a fish. Wait, what?". Eventually, the whole string has been translated into whatever end product you wanted, and you're done parsing, save perhaps for some optimization: "Ah, he _thinks_ he's a fish, and wants fish food. I'll go into the other room and call the nice young men in their clean white coats. This guy's fish slipped off the hook."
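To make that concrete, here's a hand-rolled recursive-descent sketch in plain Java: one method per grammar rule, and the "action" builds a crude paraphrase instead of machine code. The grammar it follows, the method names, and the output format are all invented for illustration - JavaCC would generate this kind of code from a grammar file instead:

```java
import java.util.*;

public class PhraseParser {
    public record Token(String type, String text) {}

    private final List<Token> tokens;
    private int pos = 0;

    public PhraseParser(List<Token> tokens) { this.tokens = tokens; }

    // Consume the next token, insisting it has the given type.
    private Token expect(String type) {
        Token t = tokens.get(pos);
        if (!t.type().equals(type))
            throw new IllegalStateException("expected " + type + ", got " + t.type());
        pos++;
        return t;
    }

    private boolean peek(String type) {
        return pos < tokens.size() && tokens.get(pos).type().equals(type);
    }

    // SENTENCE := PHRASE (COMMA PHRASE)* PERIOD
    public String sentence() {
        List<String> phrases = new ArrayList<>();
        phrases.add(phrase());
        while (peek("COMMA")) { expect("COMMA"); phrases.add(phrase()); }
        expect("PERIOD");
        // The "action": emit some end product, here a paraphrase.
        return String.join("; also, ", phrases) + ".";
    }

    // PHRASE := SUBJECT VERB OBJECT | VERB OBJECT (a command)
    private String phrase() {
        if (peek("VERB")) {
            return "command: " + expect("VERB").text() + " " + object();
        }
        return "statement: " + subject() + " " + expect("VERB").text() + " " + object();
    }

    // SUBJECT := PRONOUN | NOUN
    private String subject() {
        return peek("PRONOUN") ? expect("PRONOUN").text() : expect("NOUN").text();
    }

    // OBJECT := PRONOUN | NOUN | INDEFINITE_ARTICLE NOUN
    private String object() {
        if (peek("INDEFINITE_ARTICLE"))
            return expect("INDEFINITE_ARTICLE").text() + " " + expect("NOUN").text();
        if (peek("NOUN")) return expect("NOUN").text();
        return expect("PRONOUN").text();
    }

    public static void main(String[] args) {
        List<Token> toks = List.of(
            new Token("PRONOUN", "I"), new Token("VERB", "am"),
            new Token("INDEFINITE_ARTICLE", "a"), new Token("NOUN", "fish"),
            new Token("COMMA", ","), new Token("VERB", "feed"),
            new Token("PRONOUN", "me"), new Token("PERIOD", "."));
        System.out.println(new PhraseParser(toks).sentence());
        // "statement: I am a fish; also, command: feed me."
    }
}
```

The point of a kit like JavaCC is that you never write the `expect`/`peek` plumbing yourself - you write the grammar rules and attach the actions, and it generates the rest.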
Um. Like you cared. Most of my quick toss-off projects soon find themselves in need of customization, and a config file of some sort usually ends up being the answer. Which means I run into parsing problems relatively often for someone who's not a compiler programmer.
*shrug*