6502.org Forum  Projects  Code  Documents  Tools  Forum
It is currently Mon May 13, 2024 8:24 pm

All times are UTC




Post new topic Reply to topic  [ 2 posts ] 
Author Message
PostPosted: Tue May 10, 2022 2:54 am 
Offline

Joined: Fri May 05, 2017 9:27 pm
Posts: 858

Some Forth systems for the Commodore 64 have a single word, INTERPRET , to handle interpreting and compiling from a text stream. 64Forth and my own Fleet Forth are two such Forth systems. The QUIT loop executes INTERPRET to process the text stream.
At least one other, Blazin' Forth, uses two different words to handle interpretation and compilation. INTERPRET handles interpretation and ] , rather than just set STATE , is the compiler. Blazin' Forth's QUIT loop executes another word, RUN , to process the text stream.
I personally don't see the appeal of using two separate loops. I suppose some system implementers like the aesthetic of a separate loop for interpreting and another for compiling. The problem is the ease with which STATE can change. Blazin' Forth has some rather convoluted code to manage this.
I ported one of the latest versions of my de-compiler to Blazin' Forth and this is my analysis of Blazin' Forth's QUIT loop.
Here is the probable source for Blazin' Forth's QUIT (probable, because it is theoretically possible to generate the same compiled code from different, but functionally identical, source)
Code:
: QUIT
   BLK OFF [COMPILE] [
   BEGIN
      RP! CR QUERY RUN
      STATE @ 0=
      IF  ." OK"  THEN
   AGAIN ;

Other than having the word RUN in place of INTERPRET , Blazin' Forth's QUIT loop doesn't look unusual. Oddly, Blazin' Forth's word RUN can't be compiled by it's control flow words, but can be compiled by Fleet Forth's control flow words. Blazin' Forth's control flow words are rather restrictive. As an example, IF can have zero or one occurrences of ELSE and there are no words to swap control flow data. Blazin' Forth was written in assembly language, so this is not a problem in this instance.
Here is how RUN would be written using Fleet Forth's control flow words.
Code:
: RUN
   STATE @
   IF
      ] STATE @ NOT
      IF
         INTERPRET
      AHEAD
      CS-ROT THEN
   INTERPRET
   THEN
   THEN ;

This is the source needed to produce the code displayed with SEE .
Here is a cleaner version.
Code:
: RUN
   STATE @
   IF
      ] STATE @ ?EXIT
   THEN
   INTERPRET ;

Blazin' Forth's mystery word, the one with a length of zero and a blank for a name, is found when the text stream is exhausted.
Code:
: X  // TO BE CHANGED TO ZERO LENGTH BLANK NAME
   DONE? ON ; IMMEDIATE

DONE? is a variable in Blazin' Forth which really should have been headerless.
In theory, a compiler which didn't have to test state and an interpreter which also didn't have to test state would be faster; however, they both still have to test state as part of a test for exiting their respective loops.
The compiler loop, ] , needs to exit when the text stream is exhausted or if the state changes to interpreting. Likewise, the interpreter, INTERPRET , needs to exit when the text stream is exhausted or if the state changes to compiling. Actually, it doesn't have to test for a state change. If the compiler exits while the state is still compiling, it is because the text stream was exhausted and the interpreter will also exit for the same reason. I suppose the reason Blazin' Forth's interpreter checks for a state change is so it can use the same word to test for end of loop. That word is QUIT?
Code:
: QUIT?  ( F1-- F2 )
   STATE @ <>
   DONE? @ OR
   DONE? OFF ;

Here is the compiler. Notice the test at the bottom of the loop.
Code:
: ]
   STATE ON
   BEGIN
      ?STACK EXISTS? DUP
      IF
         0<
         IF  ,  ELSE  EXECUTE         
      CS-SWAP ELSE
            DROP NUMBER DPL @ 1+
            IF
               [COMPILE] DLITERAL
            ELSE
               DROP [COMPILE] LITERAL
            THEN
         THEN
      THEN
      TRUE QUIT?
   UNTIL ;

and the interpreter loop.
Code:
: INTERPRET
   BEGIN
      ?STACK EXISTS?
      IF
         EXECUTE
      ELSE
         NUMBER DPL @ 1+ 0=
         IF  DROP  THEN
      THEN
      FALSE QUIT?
   UNTIL ;

I suspect that the overhead of the test word QUIT? negates any speed advantage of this approach. I can't test Blazin' Forth's approach against Fleet Forth's without writing yet another kernel since Fleet Forth already compiles and interprets about fifty percent faster than Blazin' Forth. Both of these Forth systems use Indirect Threaded Code.

With Blazin' Forth's approach, the compiler begins compiling before the word executing it can finish. Because of this, it is difficult to effectively use colon within another definition. The halting colon definitions are an example of where this is a problem.

The latest change to Fleet Forth's interpreter makes it relatively easy to try something like this approach without the problem of premature compilation (starting to compile words before colon ( : ) , or any word executing colon, finishes.)
Fleet Forth's new INTERPRET is defined like this.
Code:
: INTERPRET
   BEGIN
      PAUSE NAME C@ 0EXIT
      HERE I/C
   AGAIN ; -2 ALLOT

The PAUSE is to allow multitasking while interpreting or compiling. There is no word with a blank name and length of zero. If the text stream is exhausted, WORD returns a string with length zero and this is tested. I/C is a deferred word which normally executes (I/C) to process a single blank delimited string.
I built an experimental kernel which had the words INTERPRETER and COMPILER to respectively interpret or compile a single word. ] set the state to compiling and set I/C to COMPILER . [ set the state to interpreting and set I/C to INTERPRETER . INTERPRETER and COMPILER were headerless. To improve performance, ] and [ were primitives.
When the system was built on top of this experimental kernel, the only other changes needed were to define this:
Code:
: RI/C  ( -- )
   STATE @
   IF  ] EXIT  THEN
   [COMPILE] [ ;

and replace the two occurrences of the phrase
Code:
   ['] (I/C) IS I/C

with
Code:
   RI/C

since (I/C) didn't exist in the experimental kernel.
When test loading the same 34 screens of Forth source from blocks in ram (to eliminate the overhead of disk drive access) the experimental kernel was almost a quarter of a percent faster at loading these 34 screens. hardly any speed improvement at all. Certainly not enough for me to pursue this approach.
I also tested the source for the halting colon definitions with the experimental system and it compiled fine without any changes. The halting colon definitions also worked fine. A big improvement over the situation with Blazin' Forth.
This approach does introduce a slight complication. The experimental system has two pair of states instead of two states for compiling and interpreting. One pair of states is STATE is true and I/C is set to COMPILER . The other is STATE is false and I/C is set to INTERPRETER . Granted, each pair is tightly coupled and shouldn't get out of sync with STATE being true and I/C set to INTERPRETER for example. On the other hand, anything that can go wrong...

Another idea I've heard of is to have the interpreter and compiler running as separate loops, but have them run as co-routines. The words ] and [ would only change STATE .
So, any thoughts on this. Is there any reason for the compiler and interpreter to be separate loops other than aesthetics?


Top
 Profile  
Reply with quote  
PostPosted: Sat May 14, 2022 1:11 am 
Offline

Joined: Fri May 05, 2017 9:27 pm
Posts: 858

I should have mentioned that the experimental kernel was about 40 bytes larger. The experimental system was 53 bytes larger (because of the need for the word RI/C) . Not a very large increase, but the speed improvement was marginal at not quite a quarter of a percent.
As for a system with both an interpreter and compiler loop, both running as coroutines, I do not believe there would be a speed advantage if each loop had to test state and have the coroutine word within an IF clause.
Here is an interesting possibility. Each loop has a no-op word at the beginning of its loop as a place holder. [ clears STATE , stores the no-op word at the place holder in the interpreter loop and stores the coroutine word in the place holder of the compiler word. Likewise, ] sets STATE , stores the no-op word at the place holder in the compiler loop and stores the coroutine word in the place holder of the interpreter loop.
Since these loops would be coroutines, they could both have a way to exit from two levels deep, rather than the usual one with the phrase
Code:
   R> DROP EXIT

or a code word such as
Code:
CODE 2EXIT
   PLA  PLA  ' EXIT @ JMP
   END-CODE

but this shouldn't be necessary. When one loop exits because the text stream is exhausted, it will exit into the other loop. This loop will also exit because the text stream is exhausted.
I do not know if this approach will be any faster. As with the other approach using two separate loops for the interpreter and compiler, I believe it will be bigger.
Any thoughts?
[Edit: corrected a typo]


Top
 Profile  
Reply with quote  
Display posts from previous:  Sort by  
Post new topic Reply to topic  [ 2 posts ] 

All times are UTC


Who is online

Users browsing this forum: No registered users and 6 guests


You cannot post new topics in this forum
You cannot reply to topics in this forum
You cannot edit your posts in this forum
You cannot delete your posts in this forum
You cannot post attachments in this forum

Search for:
Jump to: