shlex for Racket: Simple lexical analysis
(require shlex)    package: shlex
This library is a port of Python’s shlex library, originally implemented by Eric S. Raymond, Gustavo Niemeyer, Vinay Sajip, and other contributors. It lets users call system-like functions (e.g., system, process) safely by quoting arguments, which prevents shell injection attacks. In the other direction, it lets users convert the command string accepted by system-like functions into a list of arguments that can be used with the (safe) system*-like functions (e.g., system*, process*).
Note, however, that this library differs from Python’s library. It supports only split (shlex.split), join (shlex.join), and quote-arg (shlex.quote), with limited customization. The implementation of split is based directly on the specification of the Shell Command Language in the Open Group Base Specifications Issue 7, 2018 edition, rather than on Python’s implementation.
1 Functions
procedure
(split s [#:comment? comment?])
  s : (or/c string? input-port?)
  comment? : any/c = #t
When comment? is not #f, line comments starting with the # character are recognized and skipped.
If there is an unterminated quote or escape sequence, exn:fail:read:eof will be raised.
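Callers that split untrusted or user-typed input may want to catch this exception rather than let it propagate. The following is a minimal sketch, not part of the library: try-split is a hypothetical helper name, and returning #f on malformed input is an assumption made for illustration.

```racket
#lang racket/base
(require shlex)

;; try-split is a hypothetical helper, not part of shlex.
;; It returns the list of words, or #f when the input ends inside
;; a quote or an escape sequence.
(define (try-split s)
  (with-handlers ([exn:fail:read:eof? (lambda (e) #f)])
    (split s)))
```

With this helper, (try-split "echo 'unterminated") would produce #f instead of raising.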
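In particular, the result can be passed to system*-like functions.
Notably, the function passes all but one of the tests in Python’s test suite. The single discrepancy arises because Python handles comments incorrectly: it treats a # in the middle of a word as the start of a comment, as the ls#b example shows.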
> (split "echo -n 'Multiple words'")
'("echo" "-n" "Multiple words")
> (split "echo \"abc \\\"123\\\" def\" ghi")
'("echo" "abc \"123\" def" "ghi")
> (split "ls#b") ; Python (with comments=True) is wrong in this example, outputting ["ls"]
'("ls#b")
> (split "ls #b\nsome-file")
'("ls" "some-file")
> (split "ls #b\nsome-file" #:comment? #f)
'("ls" "#b" "some-file")
> (quote-arg "somefile; rm -rf ~")
"'somefile; rm -rf ~'"
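Because split returns a plain list of words, its result can be handed to system* without going through a shell. The following is a minimal sketch, not part of the library: run-command is a hypothetical helper name, and it assumes the first word names a program that find-executable-path can locate on the current PATH.

```racket
#lang racket/base
(require racket/system shlex)

;; run-command is a hypothetical helper, not part of shlex.
;; It splits a command line into words and runs it without a shell.
(define (run-command cmd-line)
  (define words (split cmd-line))
  (cond
    [(null? words) #f] ; nothing to run
    [else
     ;; Resolve the program name on PATH; #f if it cannot be found.
     (define exe (find-executable-path (car words)))
     ;; system* receives each word as a separate argument,
     ;; so no shell ever interprets the string.
     (and exe (apply system* exe (cdr words)))]))
```

For example, (run-command "echo -n 'Multiple words'") would invoke echo with the two arguments "-n" and "Multiple words".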
2 Safety
It might be tempting to write code as follows:
(define (ls-unsafe arg) (system (format "ls ~a" arg)))
However, the above code has a shell injection vulnerability. For example, when ls-unsafe is invoked with the argument "somefile; rm -rf ~", the argument to system would have the following value:
> (format "ls ~a" "somefile; rm -rf ~")
"ls somefile; rm -rf ~"
Because the shell treats ; as a command separator, it runs rm -rf ~ after ls somefile, which causes the home directory to be deleted.
By using quote-arg, one can avoid this attack:
(define (ls-safe arg) (system (format "ls ~a" (quote-arg arg))))
We can see that when ls-safe is called with "somefile; rm -rf ~", the argument is quoted properly, thus avoiding the attack.
> (format "ls ~a" (quote-arg "somefile; rm -rf ~"))
"ls 'somefile; rm -rf ~'"
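One way to check that the quoted argument stays a single word is to split the generated command line again. This sketch uses only split and quote-arg as documented here; ls-command is a hypothetical helper that builds the command string without running it.

```racket
#lang racket/base
(require shlex)

;; ls-command is a hypothetical helper that only builds the command line.
(define (ls-command arg)
  (format "ls ~a" (quote-arg arg)))

;; Splitting the command line again shows how a POSIX shell would
;; divide it into words: the hostile input remains one argument to ls.
(define words (split (ls-command "somefile; rm -rf ~")))
```

Under the single-quote rules shown in the earlier examples, words should be the two-element list '("ls" "somefile; rm -rf ~"), so rm never appears as a command of its own.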