Converting escape sequences from user input into their actual representation

Time: 2021-10-18 00:14:07

I'm trying to write an interpreter for LOLCODE that reads escaped strings from a file in the form:

VISIBLE "HAI \" WORLD!"

For which I wish to show an output of:

HAI " WORLD!

I have tried to dynamically generate a format string for printf in order to do this, but it seems that escape sequences are only processed when the compiler reads a string literal, not when printf sees the string at run time.

In essence, what I am looking for is exactly the opposite of this question: Convert characters in a c string to their escape sequences

Is there any way to go about this?

1 solution

#1


3  

It's a pretty standard scanning exercise. Depending on how close you intend to be to the LOLCODE specification (which I can't seem to reach right now, so this is from memory), you've got a few ways to go.

Write a lexer by hand

It's not as hard as it sounds. You just want to analyze your input one character at a time, while maintaining a bit of context information. In your case, the important context consists of two flags:

  • one to remember you're currently lexing a string. It'll be set when reading the opening " and cleared when reading the closing (unescaped) ".
  • one to remember the previous character was an escape. It'll be set when reading \ and cleared when reading the character after that, no matter what it is.

Then the general algorithm looks like: (pseudocode)

loop on: c ← read next character
  if not inString
    if c is '"' then clear buf; set inString
    else [out of scope here]
  else if inEscape then append c to buf; clear inEscape
  else if c is '"' then clear inString; return buf as result
  else if c is '\' then set inEscape
  else append c to buf

You might want to refine the inEscape case should you want to implement \r, \n and the like.

Use a lexer generator

The traditional tools here are lex and flex.

Get inspiration

You're not the first one to write a LOLCODE interpreter. There's nothing wrong with peeking at how the others did it. For example, here's the string parsing code from lci.
