I'm trying to write an interpreter for LOLCODE that reads escaped strings from a file in the form:
VISIBLE "HAI \" WORLD!"
For this input, I want the output to be:
HAI " WORLD!
I tried dynamically generating a format string for printf to do this, but it seems that escape sequences are only processed when a string literal is compiled, not when a string is built at run time.
In essence, what I am looking for is exactly the opposite of this question: Convert characters in a c string to their escape sequences
Is there any way to go about this?
1 solution
#1
3
It's a pretty standard scanning exercise. Depending on how close you intend to be to the LOLCODE specification (which I can't seem to reach right now, so this is from memory), you've got a few ways to go.
Write a lexer by hand
It's not as hard as it sounds. You just want to analyze your input one character at a time, while maintaining a bit of context information. In your case, the important context consists of two flags:
- one to remember you're currently lexing a string. It'll be set when reading the opening " and cleared when reading the closing ".
- one to remember the previous character was an escape. It'll be set when reading \ and cleared when reading the character after that, no matter what it is.
Then the general algorithm looks like: (pseudocode)
loop on: c ← read next character
    if not inString
        if c is '"' then clear buf; set inString
        else [out of scope here]
    else if inEscape then append c to buf; clear inEscape
    else if c is '"' then clear inString; return buf as result
    else if c is '\' then set inEscape
    else append c to buf
You might want to refine the inEscape case should you want to implement \r, \n and the like.
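As a rough illustration, the two-flag scan could be sketched in C as below. This is only a sketch under assumptions: the function name lex_string, its signature, and the fixed-size output buffer are hypothetical choices made for this example, and it follows the backslash convention used in the question. Escapes other than \n, \r and \t simply stand for the character that follows the backslash.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper (name and signature are assumptions for this sketch):
 * copies the first double-quoted string found in src into out, resolving
 * backslash escapes. Returns the length written, or -1 if no complete
 * string was found or it does not fit in out. */
int lex_string(const char *src, char *out, size_t out_size)
{
    bool in_string = false;   /* currently between two quotes */
    bool in_escape = false;   /* previous character was a backslash */
    size_t len = 0;

    if (out_size == 0)
        return -1;

    for (const char *p = src; *p != '\0'; ++p) {
        char c = *p;

        if (!in_string) {
            if (c == '"')
                in_string = true;   /* opening quote: start collecting */
            continue;               /* other tokens are out of scope here */
        }

        if (in_escape) {
            /* Map the escapes we know; anything else stands for itself. */
            switch (c) {
            case 'n': c = '\n'; break;
            case 'r': c = '\r'; break;
            case 't': c = '\t'; break;
            default: break;
            }
            in_escape = false;
        } else if (c == '"') {
            out[len] = '\0';        /* closing quote: string is complete */
            return (int)len;
        } else if (c == '\\') {
            in_escape = true;       /* drop the backslash, remember it */
            continue;
        }

        if (len + 1 >= out_size)
            return -1;              /* no room for c plus the terminator */
        out[len++] = c;
    }
    return -1;                      /* input ended before a closing quote */
}
```

Called on the line VISIBLE "HAI \" WORLD!" as read from the file, this should leave HAI " WORLD! in the output buffer, which can then be printed with puts or with printf("%s", ...) rather than being used as a format string itself.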
Use a lexer generator
The traditional tools here are lex and flex.
Get inspiration
You're not the first one to write a LOLCODE interpreter. There's nothing wrong with peeking at how the others did it. For example, here's the string parsing code from lci.