Handle multiline strings #1399

Closed
@CohenArthur

Multiline strings are allowed in Rust (playground link), but we currently do not handle them correctly:

test.rs:2:26: error: unended string literal
    2 |     let a = "whaaaaaat up
      |                          ^
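For reference, here is a minimal sketch of the two multiline forms that rustc accepts. The function names are hypothetical and only serve to make the values easy to inspect; the first form is the one that triggers the error above.

```rust
/// A string literal may span several source lines. The line break is
/// kept as part of the string value.
fn with_newline() -> &'static str {
    "whaaaaaat up
doc"
}

/// A backslash immediately before the line break acts as a line
/// continuation: the line break and the leading whitespace on the
/// next line are both skipped.
fn with_continuation() -> &'static str {
    "whaaaaaat \
     up"
}
```

In both cases the lexer has to keep consuming characters past the `\n` until it reaches the closing `"`.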

This is the beginning of a patch to fix the lexer error, basically commenting out the checks for a \n character:

diff --git a/gcc/rust/lex/rust-lex.cc b/gcc/rust/lex/rust-lex.cc
index ecf151dc778..c51b00fb5fe 100644
--- a/gcc/rust/lex/rust-lex.cc
+++ b/gcc/rust/lex/rust-lex.cc
@@ -1917,7 +1917,7 @@ Lexer::parse_string (Location loc)
   int length = 1;
   current_char32 = peek_codepoint_input ();
 
-  while (current_char32.value != '\n' && current_char32.value != '"')
+  while (/* current_char32.value != '\n' && */ current_char32.value != '"')
     {
       if (current_char32.value == '\\')
 	{
@@ -1949,14 +1949,15 @@ Lexer::parse_string (Location loc)
 
   current_column += length;
 
-  if (current_char32.value == '\n')
-    {
-      rust_error_at (get_current_location (), "unended string literal");
-      // by this point, the parser will stuck at this position due to
-      // undetermined string termination. we now need to unstuck the parser
-      skip_broken_string_input (current_char32.value);
-    }
-  else if (current_char32.value == '"')
+  // if (current_char32.value == '\n')
+  //   {
+  //     rust_error_at (get_current_location (), "unended string literal");
+  //     // by this point, the parser will stuck at this position due to
+  //     // undetermined string termination. we now need to unstuck the parser
+  //     skip_broken_string_input (current_char32.value);
+  //   }
+  if (current_char32.value == '"')
+    // else if (current_char32.value == '"')
     {
       current_column++;
 

However, the commented-out checks are necessary for properly handling some documentation attributes, as several test cases in our testsuite point out.

rustc performs this check in a separate pass rather than in the lexer, which I think we should do as well. For example, we could add the check after parsing a doc_attr.

Here is the relevant rustc code which checks for certain characters:

                        if let Some(c) = doc_alias
                            .chars()
                            .find(|&c| c == '"' || c == '\'' || (c.is_whitespace() && c != ' '))
                        {
                            self.tcx
                                .sess
                                .struct_span_err(
                                    meta.span(),
                                    &format!(
                                        "{:?} character isn't allowed in `#[doc(alias = \"...\")]`",
                                        c,
                                    ),
                                )
                                .emit();
                            return false;
                        }
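The character predicate in that snippet can be isolated from the compiler internals. The following is only a sketch of how such a post-lexing check might look: `find_invalid_doc_alias_char` and `check_doc_alias` are hypothetical names, not part of rustc or gccrs, and the error is returned as a plain `String` instead of going through a diagnostics API.

```rust
/// Returns the first character that is not allowed in a doc alias:
/// a double quote, a single quote, or any whitespace other than a
/// plain space (so '\n' and '\t' are rejected).
/// Hypothetical helper mirroring the predicate in the rustc snippet.
fn find_invalid_doc_alias_char(doc_alias: &str) -> Option<char> {
    doc_alias
        .chars()
        .find(|&c| c == '"' || c == '\'' || (c.is_whitespace() && c != ' '))
}

/// Validates a doc alias after it has been lexed and parsed.
/// Hypothetical wrapper: reuses the message text from the rustc
/// snippet but reports it as an `Err` value.
fn check_doc_alias(doc_alias: &str) -> Result<(), String> {
    match find_invalid_doc_alias_char(doc_alias) {
        Some(c) => Err(format!(
            "{:?} character isn't allowed in `#[doc(alias = \"...\")]`",
            c
        )),
        None => Ok(()),
    }
}
```

With a check of this shape running after the attribute is parsed, the lexer itself no longer needs to reject `\n` inside string literals.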

Fixing this issue is necessary for properly compiling certain versions of libcore, which contain multiline strings.
