#722 build failures --without-unicode

build problems
open
nobody
None
5
2017-12-07
2017-12-06
No

The configuration --without-unicode currently manages to generate a lisp.run and lispinit.mem, but no link kits (neither base nor full) and it aborts during tests -- at least on MacOSX 10.6.8 32 bit.

A) make check stops in ext-clisp.tst

*** - READ from #<INPUT BUFFERED FILE-STREAM CHARACTER #P"ext-clisp.tst" @477>
      : illegal character #\‡
Line 477 contains:
 (eq *current-language* 'FRANÇAIS)))

[10]> "FRANÇAIS" ; the terminal is still using unicode and feeds UTF-8 bytes to the process
"FRANÇAIS"
[11]> 'FRANÇAIS
*** - READ from #<INPUT CONCATENATED-STREAM #<INPUT STRING-INPUT-STREAM> #<IO TERMINAL-STREAM>>: illegal character #\?

(ext:convert-string-to-bytes "FRANÇAIS" charset:utf-8)
#(70 82 65 78 195 135 65 73 83)

While that build can read e.g. äöü (sent as a UTF-8 sequence), it perhaps rejects the byte 135 (0x87), because 135 & 0x7f = 7, which is < 32.
Likewise, РУССКИЙ causes the reader to error out.
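
The arithmetic behind that guess can be checked with a small C sketch. It only counts bytes whose low seven bits fall below 32; it does not show what the CLISP reader actually does, and the function and array names are made up for this illustration. The strings are written as hex escapes so the source file's own encoding does not matter.

```c
#include <stddef.h>

/* Hypothetical helper: count the bytes of a NUL-terminated byte string
   whose low seven bits fall in the control range, i.e. (b & 0x7f) < 32. */
size_t count_low7_control_bytes(const char *s)
{
    size_t n = 0;
    for (const unsigned char *p = (const unsigned char *)s; *p != '\0'; p++) {
        if ((*p & 0x7f) < 32)
            n++;
    }
    return n;
}

/* UTF-8 byte sequences of the three examples from the report.
   The literal is split after \x87 so that "A" is not read as a hex digit. */
const char francais_utf8[] = "FRAN\xC3\x87" "AIS";          /* FRANÇAIS */
const char aou_utf8[]      = "\xC3\xA4\xC3\xB6\xC3\xBC";    /* äöü */
const char russkij_utf8[]  = "\xD0\xA0\xD0\xA3\xD0\xA1\xD0\xA1"
                             "\xD0\x9A\xD0\x98\xD0\x99";    /* РУССКИЙ */
```

Calling `count_low7_control_bytes()` on the three arrays gives 1 for FRANÇAIS (the byte 0x87), 0 for äöü, and 3 for РУССКИЙ (the bytes 0x9A, 0x98, 0x99), which is at least consistent with the observation that only the first and last make the reader fail.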

These symbols appear within #+:gettext in the tests, but this build is without gettext (pristine MacOS): read-time conditionals are no help.

A fix could be

(string= *current-language* "FRANÇAIS")

or (equal (symbol-name *current-language*) "...") if we don't want people to believe *current-language* might be a string.

B) Most modules refer to variables like GLO(misc_encoding) or one of the other encodings. The macro with_string*() needs an encoding. Alas, those are not defined in a build --without-unicode.

Some references to encodings are harmless because the encoding argument is ignored. For instance, string_to_asciz() is unproblematic, witness clisp.h:

#define string_to_asciz(obj,encoding)  string_to_asciz_(obj)
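
Why such a reference is harmless can be seen in isolation. In the sketch below, `string_to_asciz_()` is a stand-in written for this example (it is not the CLISP function), and `demo_length()` is a hypothetical caller. Only the shape of the macro is taken from the line quoted above. `GLO` and `misc_encoding` are deliberately left undefined.

```c
#include <string.h>

/* Stand-in for the real string_to_asciz_(): it simply returns its
   argument so that the sketch stays self-contained. */
static const char *string_to_asciz_(const char *obj)
{
    return obj;
}

/* Same shape as the clisp.h definition: the second parameter never
   appears in the replacement list. */
#define string_to_asciz(obj,encoding)  string_to_asciz_(obj)

/* Hypothetical caller. GLO(misc_encoding) is dropped by the
   preprocessor before the compiler proper ever sees it, so the
   undefined names cause no error. */
size_t demo_length(const char *s)
{
    return strlen(string_to_asciz(s, GLO(misc_encoding)));
}
```

A macro or function that actually uses its encoding argument gets no such protection: the undefined name survives preprocessing and the compile fails, which is presumably what happens with with_string*() in the modules.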

The regexp module does not compile because of references to encodings.

Discussion

  • Jörg Höhle

    Jörg Höhle - 2017-12-06
    • Description has changed:

    Diff:

    --- old
    +++ new
    @@ -1,4 +1,4 @@
    -The configuration --without-unicode currently manages to generate a lisp.run and lispinit.mem, but no link kits (neither base nor full) and if aborts during tests (at least on MacOSX 10.6.8 32 bit).
    +The configuration --without-unicode currently manages to generate a lisp.run and lispinit.mem, but no link kits (neither base nor full) and it aborts during tests -- at least on MacOSX 10.6.8 32 bit.
    
     A) make check stops in ext-clisp.tst
     \*** - READ from #<INPUT BUFFERED FILE-STREAM CHARACTER #P"ext-clisp.tst" @477>
    @@ -15,13 +15,18 @@
     \#(70 82 65 78 195 135 65 73 83)
    
     While that build can read e.g. 'äöü (sent as UTF-8 sequence), it perhaps rejects the code 135 (&0x7f < 32).
    -Likewise, 'РУССКИЙ causes the read to error out.
    +Likewise, 'РУССКИЙ causes the reader to error out.
    
     These symbols appear within #+:gettext in the tests, but this build is without gettext (pristine MacOS): read-time conditionals are no help.
    +
    +A fix could be
    +(string= *current-language* "FRANÇAIS")
    +or (equal (symbol-name *current-language*) "...") if we don't want people to believe *current-language* might be a string.
    +
    
     B) Most modules refer to variables like GLO(misc_encoding) or one of the other encodings. The macro with_string\*() needs an encoding. Alas, those are not defined in a build --without-unicode.
    
     Some references to encodings are harmless, they are ignored. For instance, string_to_asciz() is unproblematic, witness clisp.h:
     \#define string_to_asciz(obj,encoding)  string_to_asciz_(obj)
    
    -Currently the regexp module currently does not compile because of references to encodings (at least on MacOSX 32 bit).
    +The regexp module does not compile because of references to encodings.
    
     
  • Sam Steingold

    Sam Steingold - 2017-12-06
    • Description has changed:

    Diff:

    --- old
    +++ new
    @@ -1,7 +1,8 @@
     The configuration --without-unicode currently manages to generate a lisp.run and lispinit.mem, but no link kits (neither base nor full) and it aborts during tests -- at least on MacOSX 10.6.8 32 bit.
    
     A) make check stops in ext-clisp.tst
    -\*** - READ from #<INPUT BUFFERED FILE-STREAM CHARACTER #P"ext-clisp.tst" @477>
    +```
    +*** - READ from #<INPUT BUFFERED FILE-STREAM CHARACTER #P"ext-clisp.tst" @477>
           : illegal character #\‡
     Line 477 contains:
      (eq *current-language* 'FRANÇAIS)))
    @@ -12,21 +13,24 @@
     *** - READ from #<INPUT CONCATENATED-STREAM #<INPUT STRING-INPUT-STREAM> #<IO TERMINAL-STREAM>>: illegal character #\?
    
     (ext:convert-string-to-bytes "FRANÇAIS" charset:utf-8)
    -\#(70 82 65 78 195 135 65 73 83)
    -
    -While that build can read e.g. 'äöü (sent as UTF-8 sequence), it perhaps rejects the code 135 (&0x7f < 32).
    -Likewise, 'РУССКИЙ causes the reader to error out.
    +#(70 82 65 78 195 135 65 73 83)
    +```
    +While that build can read e.g. `äöü` (sent as `UTF-8` sequence), it perhaps rejects the code 135 (&0x7f < 32).
    +Likewise, `РУССКИЙ` causes the reader to error out.
    
     These symbols appear within #+:gettext in the tests, but this build is without gettext (pristine MacOS): read-time conditionals are no help.
    
     A fix could be
    +```
     (string= *current-language* "FRANÇAIS")
    -or (equal (symbol-name *current-language*) "...") if we don't want people to believe *current-language* might be a string.
    +```
    +or `(equal (symbol-name *current-language*) "...")` if we don't want people to believe `*current-language*` might be a string.
    
    
    -B) Most modules refer to variables like GLO(misc_encoding) or one of the other encodings. The macro with_string\*() needs an encoding. Alas, those are not defined in a build --without-unicode.
    +B) Most modules refer to variables like `GLO(misc_encoding)` or one of the other encodings. The macro `with_string*()` needs an encoding. Alas, those are not defined in a build `--without-unicode`.
    
    -Some references to encodings are harmless, they are ignored. For instance, string_to_asciz() is unproblematic, witness clisp.h:
    -\#define string_to_asciz(obj,encoding)  string_to_asciz_(obj)
    -
    +Some references to encodings are harmless, they are ignored. For instance, `string_to_asciz()` is unproblematic, witness `clisp.h`:
    +```
    +#define string_to_asciz(obj,encoding)  string_to_asciz_(obj)
    +```
     The regexp module does not compile because of references to encodings.
    
     
  • Sam Steingold

    Sam Steingold - 2017-12-06

    Jorg, please use the "eye" button (preview) and the "?" button (help) when editing bug reports.
    thanks!

     
  • Jörg Höhle

    Jörg Höhle - 2017-12-07

    With the attached patch, make check-tests check-sacla-tests passes (minus bug #720).

    I should submit a bug report to SF because their preview does not always match the final outcome...

     
