How to create a programming language

How do I create my own programming language and a compiler for it?

Basically, your question is, "How are computer chips, instruction sets, operating systems, languages, libraries, and applications designed and implemented?" This is a billion dollar global industry that employs millions of people, many of whom are specialists. You may want to focus your question a little more.

That said, I can take a stab at this part:

I can't understand how people create programming languages and develop compilers for them.

It's surprising to me, but a lot of people consider programming languages magical. When I meet people at parties or wherever and they ask me what I do, I tell them I design programming languages and implement the compilers and tools. It's surprising how often people (professional programmers, mind you) say, "Wow, I never thought about it, but yeah, someone has to design these things." It is as if they thought that languages just spring up fully formed, with the tool infrastructure already around them.

They don't just appear. Languages are designed like any other product: by carefully making a series of tradeoffs among competing possibilities. The compilers and tools are built like any other professional software product: by breaking the problem down, writing one line of code at a time, and then thoroughly testing the resulting program.

Language design is a huge topic. If you are interested in designing a language, a good place to start is by thinking about the shortcomings of a language you already know. Design decisions often arise from considering a design flaw in another product.

Alternatively, you can consider a domain that you are interested in and then design a domain-specific language (DSL) that expresses solutions to problems in that domain. You mentioned LOGO. It is a great example of a DSL for the "line drawing" domain. Regular expressions are a DSL for the domain "find a pattern in a string". LINQ in C#/VB is a DSL for the domain "filter, join, sort and project data". HTML is a DSL for the domain "describe the layout of text on a page", and so on. There are many domains that are amenable to language-based solutions. One of my favorites is Inform7, a DSL for the domain "text-based adventure game". It is probably the highest-level serious programming language I've ever seen. Pick a domain you know something about and think about how you could use a language to describe problems and solutions in that domain.

Once you've sketched out what your language should look like, try to write down precisely the rules for determining what is a legal program and what is an illegal one. Usually you want to do this on three levels:

  1. lexical: What are the rules for words in the language? Which characters are legal, what do numbers look like, and so on.
  2. syntactic: How do the words of the language combine to form larger units? In C#, the larger units are things like expressions, statements, methods, classes, and so on.
  3. semantic: Given a syntactically legal program, how do you figure out what the program does?

Write these rules down as precisely as you can. If you do this well, you can use them as the basis for writing a compiler or interpreter. Take a look at the C# specification or the ECMAScript specification to see what I mean. They are full of very precise rules describing what constitutes a legal program and how to figure out what one does.
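To make the three levels concrete, here is a minimal Python sketch for a made-up toy language of integer arithmetic. Everything in it is an assumption chosen for illustration (the token set, the grammar, the function names `tokenize`, `parse` and `evaluate`, and the rule that `/` means integer division); it is not taken from any real specification.

```python
import re

def tokenize(source):
    """Lexical level: legal words are ASCII integers and the symbols + - * / ( )."""
    tokens = []
    for text in re.findall(r"[0-9]+|\S", source):
        if text[0] in "0123456789":
            tokens.append(("num", int(text)))
        elif text in "+-*/()":
            tokens.append(("sym", text))
        else:
            raise SyntaxError(f"illegal character {text!r}")
    return tokens

def parse(tokens):
    """Syntactic level: combine tokens into a tree using this (assumed) grammar:
         expr   := term (('+' | '-') term)*
         term   := factor (('*' | '/') factor)*
         factor := NUMBER | '(' expr ')'
    """
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else ("end", None)

    def advance():
        nonlocal pos
        token = peek()
        pos += 1
        return token

    def factor():
        kind, value = advance()
        if kind == "num":
            return ("num", value)
        if (kind, value) == ("sym", "("):
            node = expr()
            if advance() != ("sym", ")"):
                raise SyntaxError("expected ')'")
            return node
        raise SyntaxError(f"unexpected token {value!r}")

    def binary(operand, operators):
        # Left-associative chain: operand (operator operand)*
        node = operand()
        while peek()[0] == "sym" and peek()[1] in operators:
            op = advance()[1]
            node = (op, node, operand())
        return node

    def term():
        return binary(factor, "*/")

    def expr():
        return binary(term, "+-")

    tree = expr()
    if peek()[0] != "end":
        raise SyntaxError("unexpected trailing input")
    return tree

def evaluate(node):
    """Semantic level: say what a syntactically legal tree means."""
    if node[0] == "num":
        return node[1]
    op, left, right = node
    a, b = evaluate(left), evaluate(right)
    if op == "+":
        return a + b
    if op == "-":
        return a - b
    if op == "*":
        return a * b
    return a // b  # assumption: '/' is integer (floor) division

print(evaluate(parse(tokenize("2 + 3 * (4 - 1)"))))
```

Each function corresponds to one level of the written rules: a program can be rejected for an illegal character, for an illegal arrangement of legal words, or be accepted and given a meaning.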

One of the best ways to get started writing a compiler is to write a high-level-language-to-high-level-language compiler. Write a compiler that takes in strings in your language and spits out strings in C# or JavaScript, or any other language you are familiar with. Then let the compiler for that language take care of turning it into executable code.
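As a rough sketch of that idea, the Python function below translates a tiny LOGO-like language into JavaScript source text. The source language (`forward`, `back`, `left`, `right`, `repeat N [ ... ]`), the function name `compile_logo`, and the `turtle` object that the generated JavaScript calls are all hypothetical; the sketch assumes some JavaScript environment supplies a `turtle` with those methods.

```python
import re

MOVES = {"forward", "back", "left", "right"}  # assumed command set

def compile_logo(source):
    """Translate a tiny LOGO-like program into a string of JavaScript."""
    # Lexing: brackets are their own tokens; everything else splits on whitespace.
    tokens = source.replace("[", " [ ").replace("]", " ] ").split()
    pos = 0

    def take():
        nonlocal pos
        if pos >= len(tokens):
            raise SyntaxError("unexpected end of program")
        token = tokens[pos]
        pos += 1
        return token

    def number():
        token = take()
        if not re.fullmatch(r"[0-9]+", token):
            raise SyntaxError(f"expected a number, got {token!r}")
        return token

    def block(depth):
        """Compile commands until a ']' or the end of the input."""
        indent = "  " * depth
        lines = []
        while pos < len(tokens) and tokens[pos] != "]":
            word = take()
            if word in MOVES:
                # turtle.<word>(n) is a hypothetical runtime call.
                lines.append(f"{indent}turtle.{word}({number()});")
            elif word == "repeat":
                count = number()
                if take() != "[":
                    raise SyntaxError("expected '[' after repeat count")
                var = f"i{depth}"  # one loop variable per nesting level
                lines.append(f"{indent}for (let {var} = 0; {var} < {count}; {var}++) {{")
                lines.extend(block(depth + 1))
                if take() != "]":
                    raise SyntaxError("expected ']' to close repeat")
                lines.append(f"{indent}}}")
            else:
                raise SyntaxError(f"unknown command {word!r}")
        return lines

    lines = block(0)
    if pos < len(tokens):
        raise SyntaxError("unexpected ']'")
    return "\n".join(lines)

print(compile_logo("repeat 4 [ forward 10 right 90 ]"))
```

The compiler never executes anything itself. It only checks that the input is legal and emits text, leaving code generation and optimization to the existing JavaScript implementation.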

I write a blog about the design of C#, VB, VBScript, JavaScript, and other languages and tools. If this topic interests you, check it out: http://blogs.msdn.com/ericlippert (historical) and http://ericlippert.com (current)

In particular, you might find this post interesting. In it I list most of the tasks that the C# compiler performs for you during its semantic analysis. As you can see, there are a lot of steps. We break the big analysis problem down into a series of problems that we can solve individually.

http://blogs.msdn.com/b/ericlippert/archive/2010/02/04/how-many-passes.aspx

If you are looking for a job doing this kind of work when you are older, consider coming to Microsoft as a college intern and trying to get into the developer division. That's how I ended up with the job I have today!