Hi experienced people!

I am working on an interpreter of sorts. I would like its scripts to be invokable from the command line, so it would honor the “#!” first line, basically by treating any line starting with “#” as a comment.

The issue is that I want to be able to read the source code more than once. The first pass will deduce the number of lines, the number of variables, and the number of line labels. The beginning of the second pass will allocate arrays (malloc) to hold the program and its data, then re-read the source to store it internally, fill the symbol tables, and mark variables. Once the source has been read the second time, the program will begin to execute.
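A minimal sketch of the two-pass idea described above, assuming the interpreter has a file name it can open (so the stream is seekable). The names `struct program`, `measure`, and `load_program` are hypothetical, not part of any existing code. The first pass only measures; the second pass allocates exactly what was measured and stores the lines.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical container for the loaded program text. */
struct program {
    char **lines;   /* one malloc'd string per source line */
    size_t count;   /* number of lines */
};

/* Pass 1: count the lines and find the longest one. */
static void measure(FILE *fp, size_t *nlines, size_t *maxlen)
{
    size_t lines = 0, longest = 0, cur = 0;
    int c;

    while ((c = fgetc(fp)) != EOF) {
        if (c == '\n') {
            lines++;
            if (cur > longest) longest = cur;
            cur = 0;
        } else {
            cur++;
        }
    }
    if (cur > 0) {              /* last line has no trailing newline */
        lines++;
        if (cur > longest) longest = cur;
    }
    *nlines = lines;
    *maxlen = longest;
}

/* Pass 2: allocate from the measurements, rewind, and store each line.
 * Returns 0 on success, -1 on failure. */
static int load_program(const char *path, struct program *prog)
{
    FILE *fp = fopen(path, "r");
    if (!fp) return -1;

    size_t nlines, maxlen, n = 0;
    measure(fp, &nlines, &maxlen);
    rewind(fp);                 /* back to the start for the second pass */

    char *buf = malloc(maxlen + 2);                  /* text + '\n' + '\0' */
    char **lines = malloc((nlines ? nlines : 1) * sizeof *lines);
    if (buf && lines) {
        while (n < nlines && fgets(buf, (int)(maxlen + 2), fp)) {
            buf[strcspn(buf, "\n")] = '\0';          /* strip the newline */
            lines[n] = malloc(strlen(buf) + 1);
            if (!lines[n]) break;
            strcpy(lines[n], buf);
            n++;
        }
    }
    free(buf);
    fclose(fp);

    if (!lines || n != nlines) {                     /* undo partial work */
        while (n > 0) free(lines[--n]);
        free(lines);
        return -1;
    }
    prog->lines = lines;
    prog->count = n;
    return 0;
}
```

A real first pass would also count variables and labels; that part depends on the language syntax and is left out here.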

If an interpreted program is mentioned on the command line it would only get one pass at the source, right? That source would come in on standard input, and once read is no longer available.

Is there a way for my interpreter to get the file name instead of the body of the file?

While writing the question I came up with an idea, but I hope there is a better one. As a first pass, I could store each line of the program in a known temporary file; then for the second pass I could read that file. I don’t like this, but if there is no better way…
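For what it is worth, the temporary-file idea can be sketched with the standard C `tmpfile()` function, which avoids having to pick a "known" file name. The helper name `spool_to_tmpfile` is made up for illustration; it copies a non-seekable stream (such as standard input) into an anonymous temporary file that can then be rewound and read as often as needed.

```c
#include <stdio.h>

/* Copy everything from `in` into an anonymous temporary file and
 * return that file positioned at its start, or NULL on failure.
 * The temporary file is removed automatically when it is closed. */
static FILE *spool_to_tmpfile(FILE *in)
{
    FILE *tmp = tmpfile();
    if (!tmp) return NULL;

    char buf[4096];
    size_t n;
    while ((n = fread(buf, 1, sizeof buf, in)) > 0) {
        if (fwrite(buf, 1, n, tmp) != n) {   /* short write: give up */
            fclose(tmp);
            return NULL;
        }
    }
    if (ferror(in)) {                        /* read error on the input */
        fclose(tmp);
        return NULL;
    }
    rewind(tmp);                             /* ready for the first pass */
    return tmp;
}
```

After `FILE *src = spool_to_tmpfile(stdin);`, each pass can start with `rewind(src)`.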

  • mrkite@programming.dev
    1 year ago

    How big are you expecting these files to be? Instead of a temporary file just read the whole thing into RAM. If someone has a 2 gig script they should expect it to take a while.
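A rough sketch of the "just read the whole thing into RAM" suggestion, assuming a growable buffer that doubles when it fills up. The function name `slurp` is hypothetical. Once the text is in memory, any number of passes can walk the same buffer, whether it came from a file or from standard input.

```c
#include <stdio.h>
#include <stdlib.h>

/* Read all of `in` into one malloc'd, NUL-terminated buffer.
 * Stores the byte count in *len_out. Returns NULL on failure. */
static char *slurp(FILE *in, size_t *len_out)
{
    size_t cap = 4096, len = 0;
    char *buf = malloc(cap);
    if (!buf) return NULL;

    for (;;) {
        /* Always leave one byte free for the terminating NUL. */
        size_t n = fread(buf + len, 1, cap - len - 1, in);
        len += n;
        if (n == 0) break;                   /* end of input or error */

        if (cap - len <= 1) {                /* buffer full: double it */
            size_t ncap = cap * 2;
            char *tmp = realloc(buf, ncap);
            if (!tmp) {
                free(buf);
                return NULL;
            }
            buf = tmp;
            cap = ncap;
        }
    }
    if (ferror(in)) {
        free(buf);
        return NULL;
    }
    buf[len] = '\0';
    *len_out = len;
    return buf;
}
```

The caller owns the returned buffer and should `free()` it when the program text is no longer needed.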

    • WasPentalive@lemmy.oneOP
      1 year ago

      I want to make as much space available as I can. I once wrote a small language called tiny. It had a fixed array of statements, that was a hard limit on the size of any program it could run. Because I can read the source and find out how big it is, I can attempt to malloc the needed space to store and run it. That way the program can be as big as memory allows.

    • mrkite@programming.dev
      1 year ago

      Oh wow, you sure are right… I never tried that before.

      Running ./test.sh, which calls a.out (a program that just prints its argv), gives:

      0: '/Users/mrkite/a.out'
      1: './test.sh'
      

      edit: and I double checked, it doesn’t modify stdin at all either.
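The output above suggests the interpreter is handed the script's path as an argument rather than its contents on standard input. A small sketch of how an interpreter written in C might use that, assuming the shebang line carries no extra argument so the script path lands in argv[1]. The helper names `script_path` and `open_source` are hypothetical.

```c
#include <stdio.h>

/* Return the script path passed by the shebang mechanism, or NULL if
 * the interpreter was started without one.
 * Assumption: the "#!" line names only the interpreter, so the script
 * path is argv[1], as in the argv dump above. */
static const char *script_path(int argc, char **argv)
{
    return (argc >= 2) ? argv[1] : NULL;
}

/* Open the program source: the named script if there is one,
 * otherwise fall back to standard input. Returns NULL if the
 * script cannot be opened. */
static FILE *open_source(int argc, char **argv)
{
    const char *path = script_path(argc, argv);
    return path ? fopen(path, "r") : stdin;
}
```

Because the result is a regular file opened by name, it can be rewound or reopened for a second pass. When the fallback to standard input is taken, the stream may not be seekable, so a second pass would need the text buffered some other way.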

  • Alxe@lemmy.world
    1 year ago

    As far as I recall and for Linux, the shebang is interpreted by the kernel to execute the text file as the input of a given program. Are you talking about adding a shebang to a C source file? If so, this would not work, because the hash could be interpreted as a preprocessor instruction.

    Take into consideration that, in bash, you can use ${0} to get the filename of the current script. If you want the count of lines in your script wc -l ${0} ought do the trick.

    If you’re using C, you could rely on the FILE define for your imolementation but the rest implies reading the source code and then acting on it.

    • WasPentalive@lemmy.oneOP
      1 year ago

      As far as I recall and for Linux, the shebang is interpreted by the kernel to execute the text file as the input of a given program. Are you talking about adding a shebang to a C source file? If so, this would not work, because the hash could be interpreted as a preprocessor instruction.

      No, the #! is the input to an Interpreter that is written in C. The rest of the file is in “calculator language” (sort of like an HP41C but labels are merged with steps and variables take up one step each but are named so you can have as many as you like). The

      Take into consideration that, in bash, you can use ${0} to get the filename of the current script. If you want the count of lines in your script wc -l ${0} ought do the trick.

      I am not getting this, but anyway, I am writing an interpreter in C to run a program written in “Calculator Language”

      If you’re using C, you could rely on the FILE define for your imolementation but the rest implies reading the source code and then acting on it.

      What are you saying? I think that word was supposed to be “implementation”, but what is a “FILE define”? The interpreter won’t have the name of the file, will it?

      One thing I might do is provide the name of the calculator language file in the shebang so the interpreter sees it…

      This is the start of a file in calculator language named fizbuz.lnc : #! /bin/lincalc fizbuz.lnc

      The interpreter would see this first line on its standard input, stop reading standard input, and open the file “fizbuz.lnc”, but this is non-standard for a shebang line. Also, this won’t work if the shebang is stripped off the input by the shell and not provided to the called program.

    • WasPentalive@lemmy.oneOP
      1 year ago

      More background might be helpful: I am writing an interpreter that will read and execute “keystroke programmable RPN calculator” programs.

      The #! I am asking about will be in the text of the calculator program; it will cause the interpreter to be loaded.

      My understanding is that after that the rest of the file of calculator program steps will be presented to the C program as standard input. But if that is the case, then I only see this input once and have no way to elicit a second pass through the code.