# SQUOZE

SQUOZE (abbreviated as SQZ) is a memory-efficient representation of a combined source and relocatable object program file with a symbol table on punched cards. It was introduced in 1958 with the SCAT assembler[1][2] on the SHARE Operating System (SOS) for the IBM 709.[3][4] A program in this format was called a SQUOZE deck.[5][6][7] The format was also used on later machines, including the IBM 7090 and 7094.

## Encoding

In the SQUOZE encoding, identifiers in the symbol table were represented in a 50-character alphabet, allowing a 36-bit machine word to hold six alphanumeric characters plus two flag bits.[6][1] Storing each character in its own six-bit field would use all 36 bits, because six bits can represent up to 64 states although only 50 are needed for the 50 characters of the alphabet. Treating the six characters as a single base-50 number requires only 34 bits, since $50^{6} < 2^{34}$, thus saving two bits per six characters.
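The bit counts above can be checked with a few lines of Python. This is only an illustrative sketch; the constant and variable names are chosen here and do not come from the SOS manual.

```python
ALPHABET_SIZE = 50   # characters available for SQUOZE symbols
SYMBOL_LENGTH = 6    # characters per symbol
WORD_BITS = 36       # IBM 709 word length

# Largest value a six-digit base-50 number can take.
max_symbol_value = ALPHABET_SIZE ** SYMBOL_LENGTH - 1

# Bits needed when the whole symbol is one base-50 number.
bits_base50 = max_symbol_value.bit_length()

# Bits needed when every character gets its own fixed-width field.
bits_per_char_fields = (ALPHABET_SIZE - 1).bit_length() * SYMBOL_LENGTH

# Bits left over in the word for flags.
flag_bits = WORD_BITS - bits_base50

print(bits_per_char_fields, bits_base50, flag_bits)   # 36 34 2
```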

SQUOZE character codes[1]

| Most significant digits (Dec) | Oct | Bin | Characters for least significant digits +0 to +7 (Oct 0 to 7, Bin 000 to 111) |
|---|---|---|---|
| +0 | 0 | 000 | space 0 1 2 3 4 5 6 |
| +8 | 1 | 001 | 7 8 9 A B C D E |
| +16 | 2 | 010 | F G H I J K L M |
| +24 | 3 | 011 | N O P Q R S T U |
| +32 | 4 | 100 | V W X Y Z = # / % ) ⌑ |
| +40 | 5 | 101 | + & - - @ + & - * / \$ |
| +48 | 6 | 110 | , . |

Base 50 already saves one bit for every three characters, since $50^{3} = 125000 < 2^{17}$, so the six characters were encoded as two three-character chunks of 17 bits each. The manual[1] has a formula for encoding six characters ABCDEF: $(A \cdot 50^{2} + B \cdot 50 + C) \cdot 2^{17} + (D \cdot 50^{2} + E \cdot 50 + F)$
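The formula can be sketched in Python as follows. The helper name `squoze_encode` and the `CODES` dictionary are hypothetical. The sketch covers only the space, digits, letters, comma and period from the character table (codes 0 to 36, 48 and 49), leaves out the special characters at codes 37 to 47, and ignores the heading character that the manual describes for short symbols.

```python
import string

# Codes 0-36 (space, digits, letters) and 48-49 (comma, period) from the
# character table; codes 37-47 are deliberately omitted from this sketch.
CODES = {ch: i for i, ch in enumerate(" " + string.digits + string.ascii_uppercase)}
CODES[","] = 48
CODES["."] = 49


def squoze_encode(symbol: str) -> int:
    """Pack up to six characters into a 34-bit value (hypothetical helper)."""
    if len(symbol) > 6:
        raise ValueError("at most six characters per symbol")
    # Pad on the right with blanks (code 0); unsupported characters raise KeyError.
    a, b, c, d, e, f = (CODES[ch] for ch in symbol.ljust(6))
    high = a * 50**2 + b * 50 + c   # first three characters, at most 17 bits
    low = d * 50**2 + e * 50 + f    # last three characters, at most 17 bits
    return (high << 17) | low


print(oct(squoze_encode("SQUOZE")))   # 0o110114575473
```

Each three-character chunk is at most 124999, which is below $2^{17} = 131072$, so the two chunks never overlap after the shift.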

For example, "SQUOZE" stored with six bits per character, `35 33 37 31 44 17` (base 8), takes 36 bits. Encoded as two 17-bit pieces, it fits in 34 bits: `( 0o220231 << 17 ) | 0o175473 == 0o110114575473`.
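Reversing the packing recovers the six characters. The following is a minimal sketch with a hypothetical helper `squoze_decode`; as an assumption to keep it short, its table covers only codes 0 to 36, 48 and 49, so any symbol containing the special characters at codes 37 to 47 would raise a `KeyError`.

```python
import string

# Partial code table: space, digits, letters, comma and period only.
CHARS = dict(enumerate(" " + string.digits + string.ascii_uppercase))
CHARS[48] = ","
CHARS[49] = "."

MASK_17 = (1 << 17) - 1   # selects one 17-bit piece


def squoze_decode(value: int) -> str:
    """Unpack a 34-bit SQUOZE value into six characters (hypothetical helper)."""
    high = (value >> 17) & MASK_17   # first three characters
    low = value & MASK_17            # last three characters
    out = []
    for piece in (high, low):
        first, rest = divmod(piece, 50 * 50)
        second, third = divmod(rest, 50)
        out.extend(CHARS[code] for code in (first, second, third))
    return "".join(out)


print(squoze_decode(0o110114575473))   # SQUOZE
```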

A simpler example of the same logic is a three-digit BCD number, which takes up 12 bits: 987 is `9 8 7` (base 16) or `1001 1000 0111` (base 2). Any such value can instead be stored directly in 10 bits, saving two bits: 987 is `3db` (base 16) or `11 1101 1011` (base 2).
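The BCD comparison can be reproduced with a short Python sketch; `bcd_bits` is a hypothetical helper written for this illustration.

```python
def bcd_bits(n: int) -> str:
    """Encode each decimal digit of n as its own 4-bit group (BCD)."""
    return " ".join(format(int(digit), "04b") for digit in str(n))


n = 987
print(bcd_bits(n))           # 1001 1000 0111  (12 bits)
print(format(n, "010b"))     # 1111011011      (10 bits)
print(format(n, "x"))        # 3db
print((999).bit_length())    # 10, so every three-digit number fits in 10 bits
```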

## Etymology

"Squoze" is a facetious past participle of the verb 'to squeeze'.[5][6]

The name SQUOZE was later borrowed for similar schemes used on DEC machines;[4] they had a 40-character alphabet (50 in octal) and were called DEC RADIX 50 and MOD40,[8] but sometimes nicknamed DEC Squoze.
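The same arithmetic explains why three characters of a 40-character alphabet fit in a 16-bit word. The sketch below works on numeric character codes only, because the DEC character table is not given here; the helper `pack3` and the most-significant-first digit order are assumptions made for illustration.

```python
RADIX = 0o50   # "50" in octal is 40 in decimal


def pack3(codes):
    """Pack three character codes (each 0..39) as one base-40 number."""
    first, second, third = codes
    if not all(0 <= code < RADIX for code in codes):
        raise ValueError("codes must be in the range 0..39")
    return (first * RADIX + second) * RADIX + third


print(RADIX)                       # 40
print(pack3((39, 39, 39)))         # 63999, the largest packed value
print(pack3((39, 39, 39)) < 2**16)   # True, so it fits in 16 bits
```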

## References

1. SHARE 709 System Committee, ed. (June 1961) [1959]. "Section 02: SCAT Language; Appendix 1: Table of Permissible Characters; Appendix 3: SQUOZE Deck Format - Chapter 8: Dictionary". SOS Reference Manual - SHARE System for the IBM 709 (PDF). New York, USA: SOS Group, International Business Machines Corporation. pp. 02.00.01 – 02.00.11, 12.03.08.01 – 12.03.08.02, 12.01.00.01. X28-1213. Distribution No. 1–5. Archived (PDF) from the original on 2020-06-18. Retrieved 2020-06-18. pp. 12.03.08.01 – 12.03.08.02: […] Bit Positions Used […] Bit 0 […] Bit 1 […] Bits 2–35 […] Base 50 representation of the symbol with heading character. […] The base 50 representation of a symbol is obtained as follows: […] a. If the symbol has fewer than five characters, it is headed (by blank if it is in an unheaded region). […] b. The symbol with it[s] heading character is left-justified and any unused low-order positions are filled with blanks. […] c. Each character in the symbol is replaced by it[s] base 50 equivalent. […] d. The result is then converted by the following: if the symbol, after each character is rep[l]aced by its base 50 equivalent, is ABCDEF, its base 50 representation is `(A*50^2+B*50+C)*2^17+(D*50^2+E*50+F)`. […]
2. Salomon, David (February 1993) [1992]. Written at California State University, Northridge, California, USA. Chivers, Ian D. (ed.). Assemblers and Loaders (PDF). Ellis Horwood Series In Computers And Their Applications (1 ed.). Chichester, West Sussex, UK: Ellis Horwood Limited / Simon & Schuster International Group. ISBN 0-13-052564-2. Archived (PDF) from the original on 2020-03-23. Retrieved 2008-10-01. (xiv+294+4 pages)
3. Jacob, Bruce; Ng, Spencer W.; Wang, David T.; Rodriguez, Samuel (2008). "Part I Chapter 3.1.3 On-Line Locality Optimizations: Dynamic Compression of Instructions and Data". Memory Systems: Cache, DRAM, Disk. The Morgan Kaufmann Series in Computer Architecture and Design. Morgan Kaufmann Publishers / Elsevier. p. 147. ISBN 978-0-12-379751-3. (900 pages)
4. Jones, Douglas W. (2018). "Lecture 7, Object Codes, Loaders and Linkers - Final steps on the road to machine code". Operating Systems, Spring 2018. Part of the CS:3620 Operating Systems Collection. The University of Iowa, Department of Computer Science. Archived from the original on 2020-06-06. Retrieved 2020-06-06.
5. Boehm, Elaine M.; Steel, Jr., Thomas B. (June 1958). Machine Implementation of Symbolic Programming - Summary of a Paper to be Presented at the Summer 1958 Meeting of the ACM. ACM '58: Preprints of papers presented at the 13th national meeting of the Association for Computing Machinery. pp. 17-1–17-3. doi:10.1145/610937.610953. Archived from the original on 2020-06-06. Retrieved 2020-06-06. (3 pages)
6. Boehm, Elaine M.; Steel, Jr., Thomas B. (April 1959). "The SHARE 709 System: Machine Implementation of Symbolic Programming". Journal of the ACM. 6 (2): 134–140. doi:10.1145/320964.320968. S2CID 16545134. Archived from the original on 2020-06-04. Retrieved 2020-06-04. pp. 137–138: […] There is an interesting feature related to the encoding of symbols for inclusion in the dictionary. In the usual mode of expression, symbols may be constructed from a set of 50 characters. If encoding were character by character, six bits would be required for the representation of each such character. As a symbol may contain as many as six characters, a total of 36 bits would be required for the representation of each symbol. This might seem convenient, as the length of a 709 word is exactly 36 bits, but a moment's consideration shows that it is unfortunate as it would be desirable to have a bit or two available in the same word as the symbol representation, giving a clue to the nature of the symbol. These flagging bits can be obtained. Let each character possible represent a digit in a number system having a base of fifty. Now six character symbols may be read as natural numbers in a base fifty system. If these numbers are converted to the usual base two system, only 34 bits are required for the maximum number and a gain of two flag bits has been made. This has the incidental feature of decreasing the requisite number of bits for representing the entire code, but conversion time would outweigh the saving by a significant margin were it not for the peculiar length of the 709 word. Here is a clear illustration of the critical effect the precise specifications of the machine concerned hold over the details of an encoding schema. […] (7 pages)
7. Shell, Donald L. (April 1959) [October 1958]. "The SHARE 709 System: A Cooperative Effort". Journal of the ACM. 6 (2): 123–127. doi:10.1145/320964.320966. S2CID 16476514. Archived from the original on 2020-06-16. Retrieved 2020-06-16. (5 pages)
8. "8.10 .RAD50". PAL-11R Assembler - Programmer's Manual - Program Assembly Language and Relocatable Assembler for the Disk Operating System (2nd revised printing ed.). Maynard, Massachusetts, USA: Digital Equipment Corporation. May 1971 [February 1971]. p. 8-8. DEC-11-ASDB-D. Retrieved 2020-06-18. p. 8-8: […] PDP-11 systems programs often handle symbols in a specially coded form called RADIX 50 (this form is sometimes referred to as MOD40). This form allows 3 characters to be packed into 16 bits […]