dnl $ Id: $
dnl Copyright{2000,2001}: Albert van der Horst, HCC FIG Holland by GNU Public License
undefine({worddoc})
\input texinfo
@setfilename thisfilename
@afourpaper
@settitle Generating ciforth's
@setchapternewpage odd
@titlepage
@title ciforth Manual
A system to generate
a ciforth together with
its documentation.
@author Albert van der Horst
Dutch Forth Workshop
@page
@c @vskip Opt plus 1fill
Copyright @copyright{{}} 2000 Dutch Forth Workshop
Permission is granted to copy with attribution.
Program is protected by the GNU Public License.
@end titlepage
@node Top, , ,
@chapter Overview
forthvar({ci86.gnr}) is a system to generate ciforth in all its
configurations.
This is a configurator's manual.
Each ciforth comes with its own documentation;
for the generic system, however, there is just this one manual.
It is assumed that you are familiar with Forth and with ciforth
in particular.
Linux is used as the development system,
and the main tool is forthprog({m4}) , the macro preprocessor.
@pindex m4
This generates an assembler source
file and a raw documentation file
out of the single generic source, controlled by a configuration
file.
In addition there is a file with blocks that is common to all Forths.
For further processing you need an assembler, such as forthprog({nasm})
and one or more documentation tools, such as forthprog({info}). The raw
documentation file can be ordered only by a more sophisticated tool
than the usual forthprog({sort}), called forthprog({ssort}). The
@pindex ssort
particulars of all this depend on the actual configuration chosen. The
block file contains provisions that allow it to be loaded on
any of the ciforth systems, regardless of whether they are 16 or 32
bits or whether they can handle direct access to the video memory.
@chapter Non-technical background.
@section Legalese
The Forths called ciforth are made available through the
DFW (Dutch Forth Workshop).
All publications of the DFW are available under the GPL, the GNU General Public License.
The file COPYING containing the legal expression of these lines must accompany it.
This forthsamp({ci86.gnr}) system is protected by GPL.
This applies to the generic source, the macro files and the Forth
source in the block file.
@subsection Copyright of the ciforth's generated by this tool.
A ciforth generated by ci86.gnr is probably not a derived work
(a thesis written in TeX is not a derived work from TeX).
So DFW separately claims copyright for the different versions of
ciforth that it generates using this tool.
The following is present in all documentation of ciforth's:
forthquotation
Because Forth is ``programming by extending the language'' the GPL
could be construed to mean that systems based on ciforth
always are legally obliged to make the source available.
But we consider this ``fair use in the Forth sense''.
forthendquotation
In addition to the GPL the DFW states the following:
forthquotation
The GPL is interpreted in the sense that a system based on ciforth
and intended to serve a particular purpose, that purpose not being a
``general purpose Forth system'', is fair use of the system, even if it
could accomplish everything ciforth could, under the condition that the
ciforth it is based on is available in accordance to the GPL rules,
and this is made known to the user of the derived system.
Consequently, for these systems the obligation to make the source available
does not apply.
forthendquotation
@section Legal matters
My extensions are GPL-ed or library GPL-ed.
I transferred the copyright to the Dutch Forth Workshop, a foundation that
supports Forth and defends the GPL.
The original figforth is public domain and is still available.
@section Rationale
What you find here is a Forth for the Intel 86. It is an ISO Forth,
complying in detail with ISO for at least the CORE wordset.
This has been split off from a similar generic Forth that is intended
to be the last of the fig-Forths. This generic system is no longer
maintained, but the last fig-Forth is, in the sense that if a bug
should ever be found, it can be fixed. It also features
the Fig glossary, which is available
in electronic form for the first time in history. I shamelessly
copied from it.
Apart from being ISO, the Forth you have here is similar in many
respects to fig-Forth.
The motivation for having this type of Forth available follows from its
characteristics. It is available as an assembler source, and it is an
indirect threaded Forth.
An assembler source has distinct advantages for getting started from
nothing. An engineer might balk at the description of how to use a meta
compiler, but feels at ease with a (much larger) assembler manual.
(The popularity of eForth proves how popular assembler source is.
I do not think that system has much merit, besides that.)
Although speed is currently in fashion, using subroutine threaded Forth's
with optimizers, indirect threading is the preferred choice for some
applications. I did this work, because I needed it.
I also firmly believe that an optimizer on an indirect threaded system
has more information to work with and can ultimately outperform any
other system in speed.
@section History
From the introduction to the figforth installation manual:
forthquotation
The figforth implementation project occurred because a key group of Forth
fanciers wished to make this valuable tool available on a personal computing
level. In June of 1978, we gathered a team of nine systems level
programmers, each with a particular target computer. The charter of the
group was to translate a common model of Forth into assembly language
listings for each computer. It was agreed that the group's work would be
distributed in the public domain by FIG.
We intend that our primary recipients of the Implementation Project be
computer users groups, libraries, and commercial vendors.
We expect that each will further customize for particular computers and
redistribute. No restrictions are placed on cost, but we expect faithfulness
to the model. FIG does not intend to distribute machine readable versions,
as that entails customization, revision, and customer support better
reserved for commercial vendors.
Of course, another broad group of recipients of the work is the community of
personal computer users. We hope that our publications will aid in the use
of Forth and increase the user expectation of the performance of high level
computer languages.
forthendquotation
@subsection Deviations from the FIG model
The first version of ciforth complied faithfully with the fig model,
at least as faithfully as is customary. Now it is ISO, which
means a lot of details are changed about how words work.
In the following we will discuss not those details, but the changes
made to the general build up of the Forth.
(In fact some details are discussed. They are of little interest and
they will move to an appendix in the next version.)
The rigid subdivision in 7 areas was never adhered to.
In particular the boot up parameters
are not up front, as CP/M and MS-DOS require a 100H byte reserved
area there.
There is mention of forthvar({(KEY)}) being ``implementation dependent code'',
but such code was not often present in
fig implementations.
This was based on the idea that there was some EPROM with console commands.
This has been replaced by calls to an operating system, which do not
have the form of a simple function that could be called.
Here
the code definitions for forthcode({KEY}) itself
become implementation dependent code, but often it can be written in
high level.
All documentation is now accurate but only claims to describe
ciforth.
The forthcode({RUBOUT}) key is a bona-fide forthcode({USER}) variable
and now has a name.
DR0 and DR1 are removed. There is only one consecutive mass storage area, be
it a disk or a file.
The assumption in using forthcodeni({OFFSET}) was that you have two identical floppy drives
and no hard disk. That is nowadays extremely unlikely. Instead I put forthcodeni({OFFSET})
to good use to screen off a part of the floppy that must not be used (such
as an MS-DOS directory or the hard disk part that contains the Forth system).
forthvar({MOVE MON BLOCK-READ BLOCK-WRITE DLIST}) are not present.
Altering OUT to influence formatting doesn't work here, nor on
any figforth I know of.
forthcode({+ORIGIN}) now points to the first user
variable instead of some ill-defined start of the boot image.
The layout has not changed, so the negative offsets can be used for
the other data traditionally there.
Their indices have not changed, but there is now a boot-up
parameter for each and every user variable. A system with other boot
parameters can now be generated in an even more portable fashion by using
a phrase like forthsamp({7 CELLS +ORIGIN}).
Where possible, installation dependent code uses a generic call to the
operating system, in particular MS-DOS BIOS or LINOS. As a result, most of
the installation dependent code is high level.
The false urban legend that one could forthcodeni({FORGET}) forthcodeni({TASK})
has been replaced by an accurate description of forthcodeni({TASK}).
The following words have been documented for the first time:
forthcodeni({FLUSH}) forthcodeni({CURRENT}) forthcodeni({2DUP}) forthcodeni({RP_comat_}) forthcodeni({U.}).
Some non-substantial deviations from the original FIG source have been made
for good reasons.
The FIG philosophy is that sectors, blocks and screens must be compatible, but
may be all different. The original 8086 FIG had one sector for a block. I
changed that to one block for a screen. This is a boon for those
wanting to ISO-fy the sources.
The way I coded the character I/O points ahead to vectoring
forthcodeni({TYPE}) and forthcodeni({EXPECT}) rather than
forthcodeni({EMIT}) and forthcodeni({KEY}) . This way I can have the
host system handle the rub out key.
I added generic words for accessing system resources
forthcodeni({BIOS}) , forthcodeni({BDOS}) and forthcodeni({LINOS}) .
(See subsection The joy of genericity.)
Some real errors were fixed:
forthenumerate
forthitem
The redefine forthsamp({NULL}) bug is fixed.
It is no longer possible to redefine this word,
that handles the refill of the forthcode({TIB}),
by typing a <ret> immediately after a defining word.
forthitem
Forgetting part of a vocabulary, other than the forthcode({FORTH}) vocabulary
no longer crashes.
forthitem
Loading a screen with characters having an 8th bit set,
no longer crashes.
forthendenumerate
@section Evolution of ciforth
The first version of
ciforth was in fact the
figforth for the 8086
that was put in the framework of this manual.
By adding a 32 bits macro file, programming
I/O for Linux, programming I/O with non-obsolete MS-DOS calls
and a way to switch to protected mode,
this figforth became available in all ciforth configurations.
The RCS version numbers of the generic file fig86.gnr
are in the 2-branch and the latest version is available still.
(The 1-branch was experimental).
This version, however, has a manual that is not split between a generic and
a user part. But the user part of the manual forthemph({is})
generated from the generic source.
This version 2 can be seen as a 32-bit figForth.
The third and fourth
versions of ciforth (RCS branch 3 and 4)
are generated according to this manual.
As you see there is little pertinent information about these Forth's
in this manual.
All the information you need to use it is in the user manual,
generated with that version.
Branch 3 evolves towards an ISO compatible system.
Version 4 is a stable maintained ISO compatible system,
a balance between
technical criteria and compatibility issues.
@section Source
In practice the GPL
means the following (note: this is an explanation and has no legal value!).
They may be
further reproduced and distributed subject to the following conditions:
The three files comprising it must be kept together, and in particular
the reference section with the World Wide Web sites.
This Forth builds on figforth, for its source see the next section.
The maintainer can be reached at forthmail({ciforth@@spenarnc.xs4all.nl})
@section Acknowledgment
ciforth is based on the figforth
of Charlie Krajewski and Thomas Newman, Hayward, Ca.
This figforth (as are all figforth's) is public domain.
It is still available via taygeta. And of course kudos to FIG.
forthurl({ftp://ftp.forth.org/pub/Forth/compilers/native/dos})
This original version is public domain according to the
following statement:
forthquotation
All publications of the Forth Interest Group are public domain. They may be
further reproduced and distributed by inclusion of this credit notice:
This publication has been made available by the Forth Interest Group,
P. O. Box 1105, San Carlos, Ca 94070
forthendquotation
I also want to thank J. E. Smith, Philadelphia for another fig Intel86
implementation also still obtainable from
forthurl({http://www.simtel.net/pub/simtelnet/msdos/forth/fig86.zip})
This is a fairly well documented FIG Forth for the IBM PC, but its
"Seattle Computer 8086 assembler" format makes it less practical.
@chapter Background.
If you are a Unix and a Forth guru, you can skip this chapter.
_VERBOSE_({ If you think you are,
you can read this chapter and discover you are not.})
This chapter is about pervading concepts and
how tools are used, conceptually.
@section Orthogonality
The concept of orthogonality is central to this effort.
Orthogonality means that different aspects of configuration
(in this case)
are made independent of each other.
For example, ciforth can be bootable or started by MSDOS,
it can be assembled by forthprog({nasm}) or by forthprog({MASM.EXE}) .
@pindex nasm
@pindex MASM.EXE
These two choices can be made independently from each other,
and every combination ought to work.
Each choice is associated with a file of macros for forthprog({m4}) ,
so ideally, if you need to modify how forthprog({nasm})
assembly source is generated, you only need to change the file
forthfile({nasm.m4}).
This is, of course, as far as it goes.
Try as you may to separate all information about header layout
in the forthfile({header.m4}) configuration file,
a change to the order of the fields in a header will certainly have
its impact at certain places in the source.
@section Metacompilation is outdated
Meta compilation, the generation of a new version of a Forth system
by ``similar tools as compilation'', was invented for the cassette based
computer system of the late seventies.
There may be a motivation for using metacompilation to generate
a similar forth for a forthemph({different}) processor or system.
This would properly be called cross-compilation, by the way.
In a half-decent (or better) disk operating system like MSDOS the use
of meta-compilation is a mistake at the management level.
We want our Forth to be able to generate standalone programs anyway.
(a forthdefi({turnkey}) facility.)
So what do we need?
forthenumerate
forthitem
A facility to save a running system with everything that is loaded on it,
in the configuration it currently has.
forthitem
A facility to remove parts of a running system that are not needed
for an application after it has been built. (E.g. the assembler.)
forthitem
A facility to optimize some parts of a system. (Then remove the,
possibly large, optimizer.)
forthendenumerate
If you have the first facility,
you can build a powerful Forth from a small kernel and regular source code.
If you have all of them, you can build a truly
optimal Forth from a small kernel.
You do not need ``a similar tool as compilation'';
you just need ``compilation''.
The forthcodeni({SAVE-SYSTEM}) facility of course requires
in depth knowledge of the operating system.
This doesn't mean it is cumbersome or difficult.
Under Linux we need
forthexample(
{HEX
\ The magic number marking the start of an ELF header
CREATE MAGIC 7F C, &E C, &L C, &F C,
\ Return the START of the ``ELF'' header.
: SM BM BEGIN DUP @ MAGIC @ <> WHILE 1 CELLS - REPEAT ;
\ Return the VALUE of ``HERE'' when this forth started.
: HERE-AT-STARTUP ' DP >DFA @ +ORIGIN @ ;
\ Save the system in a file with NAME .
: SAVE-SYSTEM
\ Increment the file and dictionary sizes
HERE HERE-AT-STARTUP - DUP SM 20 + +! SM 44 + +!
U0 @ 0 +ORIGIN 40 CELLS MOVE \ Save user variables
\ Now write it. Consume NAME here.
SM HERE OVER - 2SWAP PUT-FILE ; DECIMAL
})
@section How m4 is used.
The Unix macroprocessor forthprog({m4}) is very powerful indeed.
@pindex m4
Testimony is that the description of its usage in here
is longer than its man-pages.
You know
forthprog({m4}) is a text substitution tool.
A macro is like a function. In the macro call the text is replaced by
the text present in the function.
Within the text the placeholders for the parameters are replaced
by the actual parameters.
In forthprog({m4}) the placeholders are forthsamp({$}{1}) ... forthsamp({$}{9}).
Parameters can be passed, and any (even multiline)
text can be given as a parameter, provided it is quoted.
We will use forthsamp({_lbracket_}) and forthsamp({_rbracket_}) (braces) throughout.
This is convenient, because they are
not used in a Basic Forth system
and they are
special anyway (e.g. for TeX).
The use of quotation is very critical at times,
and the finer points are not covered in the following.
@subsection Customization
Simple customization can be done by forthprog({m4}) as follows:
forthsamp({define(_lbracket_version_rbracket_,2.149)})
Within the text treated the version number is substituted.
@subsection Selection
Selection, often one of alternatives, is in general done as follows
forthsamp({_BITS16_(32)_BITS32_(64)_BITS64_(128)}) ,
which gives, of course, the size of a double number.
This is accomplished by
forthsamp({define(_lbracket__BITSxx_ _rbracket_,_lbracket_$1 _rbracket_)})
for the actual bitsize and
forthsamp({define(_lbracket__BITSxx_ _rbracket_,)})
for others.
Selections can be nested within other forthprog({m4}) macro constructs.
As in
forthexample(
{{_VERBOSE_}_lbracket_({_BITS64_}(_lbracket_The possibility to cycle through all (64-bit)
numbers by {forthsamp}(_lbracket_0 0 DO ... LOOP_rbracket_) is very useful indeed._rbracket_)_rbracket_)})
Here you see at work, apart from forthmacro({_BITS64_}) , the macro forthmacro({_VERBOSE_})
that allows (if turned on)
verbosity that can help understanding but is not always appreciated.
You also see forthmacro({forthsamp}) that is in fact
a markup to indicate we have a piece of Forth code there.
Selections can be used to throw out a block of
word definitions and their documentation as a whole.
For example words accessing I/O ports are not available in a Linux Forth,
as they would only lead to privilege violations.
The braces are essential here.
Without them the introduction of a comma somewhere in the text
results in forthprog({m4}) interpreting the remainder as a second parameter,
which it will ignore.
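
The effect of such selection macros can be mimicked outside forthprog({m4}) to see what happens.
The following is a minimal Python sketch, not part of the generic system:
the function make_selectors and its arguments are hypothetical names chosen for illustration.
Exactly one selector passes its argument through, the others return the empty string,
so concatenating all calls leaves only the text for the active configuration.

```python
def make_selectors(active_bits, all_bits=(16, 32, 64)):
    """Return one selector function per bit size.

    The selector for active_bits passes its argument through;
    every other selector swallows it, like a macro defined as empty.
    """
    def keep(text):
        return text

    def drop(text):
        return ""

    return {bits: (keep if bits == active_bits else drop) for bits in all_bits}


# Configure a 32-bit system and ask for the size of a double number.
sel = make_selectors(32)
double_size = sel[16]("32") + sel[32]("64") + sel[64]("128")
print(double_size)  # prints 64
```

Switching the configuration to 16 bits only changes which selector keeps its text; the line that asks for the double size stays the same.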
@subsection A postponed markup language.
Just say forthsamp({forthcode(_lbracket_+LOOP _rbracket_)}) to indicate that you want
formatting as for ``code'' words.
Later you can decide to use
forthbreak
forthsamp({define(_lbracket_forthcode _rbracket_,_lbracket__comat_code_lbracket_$}{1_rbracket__rbracket_)})
forthbreak
for forthsamp({texinfo}) or
forthbreak
forthsamp({define(_lbracket_forthcode _rbracket_,_lbracket_<B>$1</B> _rbracket_)})
forthbreak
for forthsamp({html}).
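
The idea of postponing the markup decision can be sketched in Python as well.
This is only an illustration under assumptions: the names forthcode_texinfo, forthcode_html,
BACKENDS and render are hypothetical, and the real work is done by the macro definitions shown above.
The text is tagged once with its logical role, and the concrete formatting is chosen per output format.

```python
# Two hypothetical back ends for the same logical markup "forthcode".
def forthcode_texinfo(text):
    return "@code{" + text + "}"


def forthcode_html(text):
    return "<B>" + text + "</B>"


BACKENDS = {"texinfo": forthcode_texinfo, "html": forthcode_html}


def render(word, target):
    """Format a Forth word for the chosen output format."""
    return BACKENDS[target](word)


print(render("+LOOP", "texinfo"))  # prints @code{+LOOP}
print(render("+LOOP", "html"))     # prints <B>+LOOP</B>
```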
@subsection Defining structures
Some macro calls must be considered to define a structure, in particular
forthsamp({worddoc}) .
Suppose we have a list of structures, meaning that the first person is
a child of the second and third person:
parents(_lbracket_Alice_rbracket_,_lbracket_Mary_rbracket_,_lbracket_John_rbracket_)
parents(_lbracket_Fred_rbracket_,_lbracket_Mary_rbracket_,_lbracket_Henry_rbracket_)
parents(_lbracket_Aayilah_rbracket_,_lbracket_Sjantil_rbracket_,_lbracket_Bodaji_rbracket_)
...
With
forthsamp({define(_lbracket_parents_rbracket_,_lbracket_$2_rbracket_)}) we get a list of (you guessed) the mothers.
The usage of forthmacro({divert()}) can best be explained with an example in this context.
forthexample(
{{define(_lbracket_parents_rbracket_,
_lbracket_{divert(3)dnl}
$}{2
{divert(6)dnl}
$}{3
_rbracket_)}})
will give out the mothers on channel 3 and fathers on channel 6.
The output will be concatenated,
but all mothers and all fathers stay together.
For forthsamp({dnl}) see the forthprog({m4}) man-page.
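
What the diversions do to the order of the output can be imitated with a small Python sketch.
It is an emulation under assumptions, not the macro itself: the function run_parents is a hypothetical name,
and the only behaviour modelled is that each numbered channel collects its own text
and that the channels are emitted in ascending order at the end.

```python
from collections import defaultdict


def run_parents(records):
    """Emulate the diverting 'parents' macro on (child, mother, father) records."""
    channels = defaultdict(list)       # channel number -> collected lines
    for _child, mother, father in records:
        channels[3].append(mother)     # second parameter goes to channel 3
        channels[6].append(father)     # third parameter goes to channel 6
    # At the end the channels are emitted in ascending numerical order.
    output = []
    for number in sorted(channels):
        output.extend(channels[number])
    return output


records = [
    ("Alice", "Mary", "John"),
    ("Fred", "Mary", "Henry"),
    ("Aayilah", "Sjantil", "Bodaji"),
]
print(run_parents(records))
# prints ['Mary', 'Mary', 'Sjantil', 'John', 'Henry', 'Bodaji']
```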
@subsection Defining lists
By using an extra pair of braces you can have a list in forthprog({m4}) .
So forthsamp({_lbracket__lbracket_A_rbracket_,_lbracket_B_rbracket_,_lbracket_C_rbracket_,_lbracket_D_rbracket__rbracket_}) is
a single parameter to a macro and can
be passed to other macros as a whole.
The outer braces are removed and,
without special measures (reinstalling extra braces again),
the macro called forthemph({sees})
the commas and concludes there are four parameters.
This is put to good use in the ``See also'' and ``Test''
fields of the forthsamp({worddoc}) structure.
These fields may have zero or more parts.
The ``Test'' field contains the tests in the odd fields, and the
expected outcome in the following even fields.
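
The odd/even convention of the ``Test'' field amounts to pairing up a flat list.
A minimal Python sketch follows; the function pair_tests and the sample field content are hypothetical
and serve only to show how a tool that consumes the list could recover the pairs.

```python
def pair_tests(test_field):
    """Split a flat Test list into (test, expected outcome) pairs."""
    if len(test_field) % 2 != 0:
        raise ValueError("every test needs an expected outcome")
    tests = test_field[0::2]      # 1st, 3rd, ... members: the tests
    expected = test_field[1::2]   # 2nd, 4th, ... members: the outcomes
    return list(zip(tests, expected))


# Hypothetical content of a Test field with two tests.
field = ["1 2 + .", "3 ", "5 DUP * .", "25 "]
for test, outcome in pair_tests(field):
    print(repr(test), "->", repr(outcome))
```

An empty list yields no pairs, which matches a field with zero parts.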
@subsection Defining aliases
Sometimes you need aliases, i.e. other names for macros.
Although it doesn't properly belong here as a technique, I want
to mention it, because the number of brackets is hard to sort
out. An alias is useful in a transition period, where you want
to rename something, but where you want to be able to do that
gradually on a file by file basis.
forthsamp({define(_lbracket__OLDNAME__rbracket_,_lbracket__NEWNAME_(_lbracket_$1_rbracket_,_lbracket_$2_rbracket_)_rbracket_)})
After forthvar({_OLDNAME_}) is phased out everywhere this definition can be deleted.
Note that for this to work all parameters applicable to forthvar({_NEWNAME_}) must be
taken into account, the two shown here are just an example.
@subsection Impress the crowd
By using macros to define other macros, then passing the result through
forthprog({m4}) another time, severe stress can be laid upon the intelligence
of the everyday person.
The very inconvenient way nodes must be linked in texinfo even forced
me to define part of the macro in one macro and the remainder in
another.
@section How forthprog({ssort}) is used
The sorting tool forthprog({ssort}) can order multiple field records, with
different sorting criteria for each field.
The fields can be defined by regular expressions, such that
the forthsamp({worddoc}) structures can be sorted by name, or by wordset
then by name, or in about any way you want.
Because such a tool didn't exist, I had to write it.
@subsection Analyzing forthsamp({worddoc})
forthprog({ssort}) captures the structure of a forthsamp({worddoc}) as follows:
forthsamp({^worddoc(_lbracket__comat__rbracket_,_lbracket__comat__rbracket_.*\n$worddoc})
The part between forthsamp({^}) and forthsamp({$}) matches the record.
The part after the last forthsamp({$})
is for synchronization, to make sure the record doesn't end early.
This would result in an error ``not according to structure'': the next
line doesn't start with ``worddoc'' and so it just doesn't match the record
description.
The forthsamp({$}) is merely a separation, (newlines are indicated by forthsamp({\n}) ).
The forthsamp({.*}) matches anything, including new lines.
But it isn't greedy as in ordinary regular expressions:
since it is not stopped by forthsamp({\n}) ,
a greedy match would swallow the whole file.
Here it tries to match as little as possible.
forthsamp(_comat__rbracket_) is shorthand for forthsamp([^_rbracket_]*_rbracket_$)
so a ``sequence of anything except
right braces followed by a right brace''.
It also contains the forthsamp({$}) to
mark the end of a field.
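
The difference between the greedy and the minimal match can be tried out with ordinary regular expressions.
The sketch below uses the Python re module purely as a stand-in; it is not the pattern syntax of forthprog({ssort}),
and the two sample records are hypothetical.
The lookahead at the end plays the role of the synchronization part.

```python
import re

RECORD = re.compile(
    r"^worddoc\(\{([^}]*)\},\{([^}]*)\}"   # wordset field and word name field
    r"(.*?\n)"                             # rest of the record, as short as possible
    r"(?=worddoc|\Z)",                     # synchronization: next record or end of input
    re.DOTALL | re.MULTILINE,
)

# The same idea with a greedy match and no synchronization.
GREEDY = re.compile(r"^worddoc\(.*\n", re.DOTALL | re.MULTILINE)

raw = (
    "worddoc({STACK},{SWAP},{swap},{a b -- b a},{ISO},\n"
    "{Exchange the top two stack items.},{},{})\n"
    "worddoc({STACK},{DUP},{dupe},{a -- a a},{ISO},\n"
    "{Duplicate the top stack item.},{},{})\n"
)

records = [(m.group(1), m.group(2)) for m in RECORD.finditer(raw)]
print(records)                    # prints [('STACK', 'SWAP'), ('STACK', 'DUP')]
print(len(GREEDY.findall(raw)))   # prints 1: the greedy match swallows both records
```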
@subsection Sorting fields
Once we know what the fields are,
forthsamp({-M 1S2S }) sorts on the first field
and within that field on the second. We just use the ordinary ASCII collating
sort, indicated by forthsamp({S}) .
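
Sorting on the first field and, within that, on the second is the same as sorting on a compound key.
A minimal Python sketch, assuming records that are already split into fields;
the function sort_records and the sample wordset names are hypothetical,
and plain string comparison stands in for the ASCII collating sort.

```python
def sort_records(records):
    """Order (wordset, name, body) records on wordset, then on name."""
    return sorted(records, key=lambda record: (record[0], record[1]))


records = [
    ("STACK", "SWAP", "..."),
    ("CORE", "THEN", "..."),
    ("STACK", "DUP", "..."),
    ("CORE", "IF", "..."),
]
for wordset, name, _body in sort_records(records):
    print(wordset, name)
```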
@chapter Structures and processes
@section The generic source file
The generic source file forthfile({ci86.gnr}) mostly
consists of Intel assembly code, with which, I assume,
you are familiar. All macro's in the following are forthfile({m4}) macro's.
Words are divided in small (<20) groups of cooperating words,
the forthdefi({wordset}). See also ``Thinking Forth''.
The things that differ among assemblers, are taken care of by
simple macro's, e.g. forthmacro({_COMMENT}) starts a comment.
Most of the time they don't have parameters.
The selection of parts that go or don't go into a particular configuration
is done by multiline macro's, generally with a call on a separate line.
Such as:
forthexample({
{_HIGH_BUF_}(_lbracket_
BUF1 EQU EM-(KBBUF+2*2)*NBUF ;_lbracket_ FIRST DISK BUFFER_rbracket_
STRUSA EQU BUF1-US ;_lbracket_ User area_rbracket_
_rbracket_);{_END_}(_lbracket_ _HIGH_BUF__rbracket_) })
Note how comments are protected from macro expansion by quotation.
The forthmacro({_END_}) is an adornment. It expands to nothing.
So it doesn't show up in the output,
but it helps to keep the generic source organized.
The forthmacro({worddoc}) macro defines a structure with additional information
of a word.
Generally it is placed in front of the word.
The same word can be found several times in the input file,
but only one is selected in a particular configuration.
The same goes for the corresponding forthmacro({worddoc}) .
forthbreak
Its fields are:
forthenumerate
forthitem
Wordset name.
forthitem
Word name.
forthitem
Pronunciation.
This is a pure textual and pronounceable identification of the word.
It is also used in forthfile({texinfo}) that doesn't handle special characters well.
forthitem
Stack effect.
The stack effect obeys all the conventions put forth in the user manual.
forthitem
Properties.
Properties are i.a. immediate and such, and the standards with which
this word complies.
Again this is described in the user manual.
forthitem
Description.
forthitem
References.
This is a list of names of other Forth words,
that can be studied to better understand this one.
forthitem
Tests.
This is a list.
The first and all other odd members are tests,
code that can be passed to Forth.
The second and all other even members are the expected outcomes of the
preceding tests.
forthendenumerate
A forthmacro({worddoc}) is such that the structure starts with forthsamp({worddoc( }) and
ends with a forthsamp({_rbracket_)}) at the end of a line.
This means that a worddoc
can be simply skipped if it occurs in Forth code,
by defining a word forthcode({worddoc(}) that reads and ignores source up to the
end sentinel.
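
The skipping rule is simple enough to sketch outside Forth.
The Python function below, strip_worddocs, is a hypothetical illustration and not the Forth word itself:
it drops everything from a line starting with the opening of the structure
up to and including the first line that ends with the sentinel.

```python
def strip_worddocs(lines):
    """Drop every block that starts with 'worddoc(' and ends with a line ending in '})'."""
    kept = []
    skipping = False
    for line in lines:
        if not skipping and line.startswith("worddoc("):
            skipping = True
        if skipping:
            # The end sentinel is a right brace plus right parenthesis at end of line.
            if line.rstrip().endswith("})"):
                skipping = False
            continue
        kept.append(line)
    return kept


source = [
    "worddoc({STACK},{DUP},{dupe},{a -- a a},{ISO},",
    "{Duplicate the top stack item.},{},{})",
    ": SQUARE DUP * ;",
]
print(strip_worddocs(source))  # prints [': SQUARE DUP * ;']
```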
The forthmacro({worddocchapter}) macro defines a wordset.
It has the same fields as a forthmacro({worddoc}) macro,
but most are left empty.
It is primarily used for its ``description'' field,
that is used as an overview description for the wordset in glossaries.
These macro's can be put anywhere,
but take care to exclude macro's for wordsets that are not present.
@section The process
The ultimate information about how a ciforth is generated is in the makefiles:
forthfile({Makefile}) and forthfile({test.mak}) .
The process of generating a program proceeds along the following steps:
forthenumerate
forthitem
Generate the assembler source from the generic source via a configuration file.
The file suffix indicates which assembler to use.
forthitem
Generate an object file.
forthitem
Link the object file.
forthendenumerate
Once you have an assembler file,
you can do what you want with it.
Proceeding from an assembler source file to a binary is in general straightforward.
The process of generating the program's documentation
(TeX and info)
proceeds along the following steps:
forthenumerate
forthitem
Generate the raw glossary documentation
from the generic source via a configuration file.
The file suffix is forthfile({.rawdoc}) .
forthitem
Sort the forthfile({.rawdoc}) file,
such that words of a wordset appear together,
and are preceded by a wordset documentation.
The file suffix is forthfile({.mig}) .
forthitem
Generate the glossary documentation from the forthfile({.mig}) file
by expanding the forthmacro({worddoc}) entries into glossary entries using
forthfile({gloss.m4}) or forthfile({glosshtml.m4}) .
This, for a second time (!), takes into account the configuration file{}_VERBOSE_(
{, to generate exactly fitting information}).
forthitem
Expand the ``postponed markups'' in forthfile({ciforth.mi})
using the macros from forthfile({manual.m4}) to
generate the texinfo commands.
This file includes all the other forthfile({.mi}) files with postponed markups.
forthendenumerate
The process for generating html has the postponed markups and the
expansion into glossary entries in the same file forthfile({glosshtml.m4}).
Only the documentation of the glossary enters into html ,
and the forthfile({ciforth.html}) is generated from the intermediate
file forthfile({.mig}).
Generating documents is made more complicated by the requirements for special tables.
For forthsamp({html}) we want an extra alphabetic list of all the words that we can click on to
get to the glossary entry immediately.
forthbreak
In forthfile({texinfo}) we need to build complicated menu structures, that refer back and forth.
forthbreak
This is done by separate passes over the forthfile({.rawdoc}) or forthfile({.mig})
files, with other macros.
@chapter off we go
@section Introduction
What you find here is a Forth for the Intel 86.
Not much more can be said for such a highly configurable system.
But in this section we will try to summarize the common characteristics.
It borrows some philosophy from the old figforth.
It is in fact based on it, and the first draft of its documentation
was copied from it.
The Forths are built from
an assembler source, and each is (in general) an
indirect threaded Forth.
The motivation for having this type of Forth available follows from its
characteristics.
An assembler source has distinct advantages for getting started from
nothing.
An engineer might balk at the description of how to use a meta
compiler,
but feels at ease with a (much larger) assembler manual.
Although speed is currently in fashion, with subroutine threaded Forths
using optimizers, indirect threading is the preferred choice for some
applications.
Furthermore the current trend toward subroutine threaded Forths may very
well be unsuitable for 64-bit processors like the Alpha.
I did this work, because I needed it for my thesis on computer intelligence.
@subsection 32 bits
It is unusual for a Forth to be configurable as 16 or 32 bit.
It turned out that the addition of forthcode({CELL+}) goes a long way toward allowing
utilities like a decompiler to be 16/32 bit clean. In the documentation
it is mostly sufficient to refer to cells. But the macros
forthsamp({_BITS_})
forthsamp({_BIT16_}) and
forthsamp({_BIT32_})
can be used to give the actual number of bits, and to mark parts that apply to 16 bits only
and 32 bits only, respectively.
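As an illustration of what 16/32 bit clean means, the following sketch
steps through memory with forthcode({CELL+}) instead of adding a fixed 2 or 4.
It assumes an ISO style system that also has forthcode({?DO}) and forthcode({CELLS}) .
The names forthcode({ZERO-CELLS}) and forthcode({SCRATCH}) are made up for this example
and are not part of ciforth.
forthexample(
{( ZERO-CELLS : hypothetical word, stores zero in n consecutive cells )
: ZERO-CELLS ( addr n -- )
   0 ?DO
      0 OVER !      ( clear the current cell )
      CELL+         ( step one cell: 2 bytes on 16 bit, 4 bytes on 32 bit )
   LOOP
   DROP ;

( SCRATCH : hypothetical three cell buffer to try it on )
CREATE SCRATCH 3 CELLS ALLOT
SCRATCH 3 ZERO-CELLS
})
Because no cell size appears in the source, the same text should load
unchanged on a 16 bit and on a 32 bit build.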
@subsection System requirements
This generic version -if suitably built- runs on industry standard hardware
("PC's") : standalone, under Linux and under MSDOS/MSWINDOWS.
To build, you need a version of forthprog({nasm}) , forthprog({TASM.EXE}) or forthprog({MASM.EXE}) on your system. I
recommend forthprog({nasm}) , it is an open source assembler and available on different
platforms, at least MSDOS and Unix. It avoids many of the design errors I
find in the Intel ways of forthprog({MASM.EXE}) . It generates a binary without a linker.
By contrast, Borland's forthprog({TASM.EXE}) , for example, can nowadays be bought only as part
of a giant C++ package.
If you want to use the generic possibilities you will need a Unix system
with all of its tools. I use GNU-Linux (RedHat) and do the makes and version
control on that. If you want your bootable floppies made from Linux to be
MSDOS-compatible you need mtools.
@subsection Assembler sources
The following two generated assembly sources are supplied as a service.
These are in fact just examples. You can generate different ones (see next
section.)
The file forthfile({alone.asm}) can be assembled using forthprog({nasm}) . It includes a boot
sector such that it can boot from a standard floppy on an industry standard
Intel PC. If you have the mtools set (most Linux'es have it) the Makefile
shows you how to make the floppy. On MSDOS you can use forthprog({DEBUG.EXE}) .
If you run on Linux with
forthsamp({mtools}) , forthsamp({make boot}) will do it.
The resulting floppy will even be recognized by
MSDOS, such that you can copy block sources to it.
forthsamp({make moreboot})
will do this from Linux, then you will have forthfile({forth.lab})
available.
forthsamp({make allboot})
will do it all, but it needs a working Forth
on Linux for doing some calculations.
Otherwise on MSDOS (I recommend version 3.3, the most stable MSDOS ever)
adapt the example forthfile({genboot.asm}) .
The file forthfile({msdos.msm}) can be assembled
using forthprog({TASM.EXE}) or forthprog({MASM.EXE}).
The resulting Forth
executable can be run off hard disk and respects the file system on it.
It uses the file forthfile({forth.lab}) .
@subsection A generic Forth
As was mentioned before, ciforth has one single source file: the generic forthfile({ci86.gnr}) .
All advantages of assembler source would be gone, if an engineer were
confronted with conditional compilation and lots of code for other systems
he doesn't want to learn or assemblers he doesn't want to use.
So we proceed in two steps. First a clean assembler source is generated from
the generic Forth using configuration files. Then the assembler source is
processed in one of a number of ways, each way familiar to one brand of
engineers.
You can customize at a number of levels.
forthenumerate
forthitem
Configuration files have extension forthsamp({.cfg}) , these are files with forthprog({m4})
commands. They are intended for use at the highest and easiest level of
configuration and contain their own simple usage instructions.
forthitem
forthprog({m4}) files have extension forthvar({.m4}), and control one aspect of genericity, such
as the header layout or the protection mode. You definitely need to know forthprog({m4})
to use these.
forthitem
Assembler files can be customized in the traditional way by adapting
constants, or uncommenting source lines. The assembler files are distinct
from the one generic source. No forthprog({m4}) is involved; you need only cope with the directives
of your assembler, and will not see any code applicable to other operating
systems or I/O systems. (It is not commented out, it is just not there.)
forthitem
You can adapt the generic source. This is difficult, but gratifying.
If you manage to ISO-fy it, the result is a lot of ISO systems, not just one.
forthendenumerate
@subsection Level 1 customization.
This assumes you run on Unix.
By specifying what you want in a configuration file you can generate a host
of assembler listings. This is as simple as replacing ``_yes'' with ``_no'' in
configuration files.
See the examples forthfile({msdos.cfg}) and forthfile({alone.cfg}) and the Makefile.
You can find out what the options are by inspecting forthfile({prelude.m4}) .
There is a division of labor between your configuration file
and the forthfile({prelude.m4}) and forthfile({postlude.m4})
files. forthfile({prelude.m4}) sets all variables to defaults,
for sets of alternatives this is NO, waiting to be overwritten.
For the other options it is the most sensible one. You must
include forthfile({prelude.m4}) first in your configuration
file. Then you specify your configuration and include
forthfile({postlude.m4}). forthfile({postlude.m4}) will correct
the options to the most, or the only, sensible ones for that configuration.
It will reject some of the configurations that will not
assemble, or lead to programs that do not work. After
including forthfile({postlude.m4}) you can still overwrite some of the
sensible defaults. This way you can even force the generation of source
that the assembler will reject anyway.
An example is the default stack size of 64K
for 32 bit programs. Sensible as it is, you may want to have a
32 bit Forth that runs in 64K. You will overwrite that stack
size. Be careful.
With respect to the assembler you can choose between forthprog({nasm}) and forthprog({MASM.EXE}) , with
file extension forthfile({.asm}) and forthfile({.msm}) respectively.
The forthvar({.msm}) files are accepted by
TASM.EXE too. You can generate an equivalent forthfile({.s}) file, but this is
experimental and doesn't lead to a working Forth.
With respect to the I/O (words like forthcodeni({EXPECT})
forthcodeni({R\W}) ) you can choose between three alternatives on MSDOS. (
forthcode({R\W}) is what was named forthcode({R/W}) in
figForth, but that name is reserved by ISO now.)
You can use DOS in the classic way, forthmacro({_CLASSIC_}) , as with the original. This
means that the floppy is used directly without regard for directory
structures. This uses calls that are declared obsolete.
You can use DOS in a modern way, forthmacro({_MODERN_}) . This allocates blocks in the
file with name forthfile({forth.lab}) . This name is available in the string forthcodeni({BLOCK-FILE})
for you to change, also at run time. No (as of 2000 ) obsolete MSDOS calls
are used (checked against the MS-DOS programmer's reference "covers through
version 6" ISBN 1-55615-546-8).
You can use the BIOS, forthmacro({_USEBIOS_}) . No MSDOS interrupts are required.
With respect to I/O on Linux you can choose between c-based and native.
The c-based version may be portable to other I86 unices. The native version
is of course not. All Linux versions have their blocks in a file. (Accessing
a floppy in the classic way is perfectly possible -- and implementing it would
be a perfectly pointless exercise.)
With respect to the hosting you can choose between forthmacro({_HOSTED_}) ( forthmacro({_HOSTED_LINUX_}) or
forthmacro({_HOSTED_MSDOS_})) and forthmacro({_BOOTED_}) ( forthmacro({_BOOTDF_}) or forthmacro({_BOOTHD_}) ).
A hosted version relies on MSDOS or Linux to get the program started.
(It may or may not use MSDOS for I/O, once started.)
A forthmacro({_BOOTED_}) version contains a boot sector, such that
you can make a standalone version that boots from floppy or hard disk.
A forthmacro({_BOOTED_}) version may very well be startable from plain DOS and its files
visible from DOS.
Of course a forthmacro({_BOOTED_}) version that tries to use MSDOS I/O (or Linux) crashes
immediately, so not all versions are useful.
You have a choice between 16 or 32 bit protected mode and real mode.
Of course on Linux real mode is not an option (but you could run the
MSDOS emulator). Protected mode Forths on MSDOS cannot be started from
virtual real mode, e.g. they will not run in a "DOS box" in Windows.
If you manage to specify conflicting options the preprocessor (forthprog({m4})) breaks
off and you can find the exit code in forthfile({postlude.m4}) . Then you can reason back
why this is a conflict. For example error 1000 indicates floppy and hard
disk I/O at the same time.
From forthfile({postlude.m4}) you see that forthmacro({_RWFD_})
and forthmacro({_RWHD_}) are on at the same time.
forthmacro({_RWHD_}) is turned on because you wanted to boot
from hard disk or you specified it yourself in the first place.
Etc.
forthfile({postlude.m4}) does you another favor. It derives
logical consequences, such as once you decide for a
forthmacro({_REAL_}) mode Forth, it must be
forthmacro({_BITS16_}) and you need not specify that_VERBOSE_({
(And yes, we could add a real mode 32 bits Forth, any
volunteers?)}). In particular forthmacro({_LINUX_N_}) or
forthmacro({_LINUX_C_}) define a whole configuration.
@subsection Level 2 customization.
You are on your own here.
@subsection Level 3 customization.
So you have this assembler file,
and it looks like what you want to have, but not quite.
And of course it doesn't work.
@subsubsection My rants
The usual customization in assembler files is possible.
If you use other than 3" floppy disks you have to specify the disk
parameters. Parameters for a 5" HD floppy are present and can be commented
in.
If you do not need a DOS-compatible floppy, you can put the image
immediately after the boot sector. A bootable hard disk version always works
like that.
You can change the default name of the forthcodeni({BLOCK-FILE}) at run time.
If you want to change the header layout, you will find that the way headers
are done via macros makes it more pleasant to use the generic listing.
If you want, you can use this as a starting point for generating a whole
other Forth (like me).
If you want to boot into your 20 Gbyte disk (like me), you probably have a
version 3.0 super modern LBA BIOS. There is no file system, just 20,000,000
blocks (and yes a 16 bit system would be inconvenient). If you want to use
an older system you must experiment by using the forthcodeni({BIOS}) word.
(You need not resort to assembler for experimenting.)
Then you can adapt your assembler listing.
@subsubsection FIG's rants
You may want to use the assembly code of this ciforth to
base a new Forth on. If this adversely affects the documentation
I urge you not to do that but to use the generic system.
The following words
are traditionally
the only portion that needs to change between different
installations with the same CPU.
They cannot come close to the capabilities
of the generic system,
and should be used for minor modifications only.
There are five words that need adaptation:
@table @code
forthitem KEY
Push the next ASCII value (7 bits) from the terminal keystroke to the
computation stack and execute NEXT. High 9 bits are zero. Do not echo
this character, especially a control character.
forthitem EMIT
Pop the computation stack (16 bit value). Display the low 7 bits on the
terminal device, then execute NEXT. Control characters have their
natural functions.
forthitem ?TERMINAL
For terminals with a break key, wait till released and push to the
computation stack 1 if it was found depressed; otherwise 0.
Execute NEXT. If no break key is available, sense any key depression as
a break (sense but don't wait for a key). If both the above are
unavailable, simply push 0 and execute NEXT.
forthitem CR
Execute a terminal carriage return and line feed. Execute NEXT.
forthitem R\W
This colon-definition is the standard linkage to your disc. It requests
the read or write of a disc block, be it raw disk or allocated in a file.
@end table
On primitive systems these may be jumps to ROM-code. But generally on i86
facilities like this are available using forthdefi({INT})'s, a kind of trap.
These observe operating system protocols and are available as high level Forth
code.
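Because these words are reachable from high level Forth, their division of
labor can be tried out interactively.
The sketch below shows why forthcode({KEY}) must not echo:
the calling program decides what reaches the screen through forthcode({EMIT}) .
It assumes an ISO style system with forthcode({<>}) and takes 13 as the
carriage return code; the name forthcode({ECHO-LINE}) is made up for this
example and is not part of ciforth.
forthexample(
{DECIMAL
( ECHO-LINE : hypothetical word, shows typed keys until a carriage return )
: ECHO-LINE ( -- )
   BEGIN
      KEY DUP 13 <>    ( KEY delivers the character without echoing it )
   WHILE
      EMIT             ( the program itself decides what is displayed )
   REPEAT
   DROP CR ;           ( discard the carriage return and start a new line )
})
If forthcode({KEY}) echoed on its own, every character typed here would
appear twice.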
@subsubsection FIG's rants : Ram disc simulation
If disc is not available, a simulation of forthcode({BLOCK}) and
forthcode({BUFFER}) may be made in RAM.
The following definitions set up high memory as mass storage.
Referenced ``screens'' are then brought to the ``disc buffer'' area.
This is a good method
to test the start-up program even if disc may be available.
forthexample(
{HEX