Processing a list with intermediate state

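Problem description:

I am processing a list of strings; you can think of them as the lines of a book. When a line is empty, it must be discarded. When it is a title, it is "saved" as the current title. Every "normal" line must generate an object containing its text and the current title. In the end you have a list of lines, each with its corresponding title.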

Example:

- Chapter 1 

Lorem ipsum dolor sit amet 
consectetur adipisicing elit 

- Chapter 2 

sed do eiusmod tempor 
incididunt u

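The first line is a title, the second line must be discarded, and the next two lines are kept as paragraphs, each with "Chapter 1" as its title. And so on. You end up with a collection like this: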

{"Lorem ipsum...", "Chapter 1"}, 
{"consectetur...", "Chapter 1"}, 
{"sed do...", "Chapter 2"}, 
{"incididunt ...", "Chapter 2"}

I know the title/paragraph model doesn't make 100% sense, but I simplified the model to illustrate the problem.

This is my iterative solution:

let parseText allLines =
    let mutable currentTitle = String.Empty
    seq {
        for line in allLines do
            match parseLine line with
            | Empty -> 0 |> ignore
            | Title caption -> currentTitle <- caption
            | Body text -> yield new Paragraph(currentTitle, text)
    }

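The first issue is that I have to discard empty lines. I do that with 0 |> ignore, but it looks pretty bad to me. What is the proper way to do this (without pre-filtering the list)?

A tail-recursive version of this function is simple: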

let rec parseText allLines currentTitle paragraphs =
    match allLines with
    | [] -> List.rev paragraphs // the accumulator is built in reverse order
    | head :: tail ->
        match parseLine head with
        | Empty -> parseText tail currentTitle paragraphs
        | Title caption -> parseText tail caption paragraphs
        | Body text ->
            // prepend to the accumulator, not to the remaining input
            parseText tail currentTitle (new Paragraph(currentTitle, text) :: paragraphs)

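Question(s):

Is there a significant difference between the two versions (style, performance, etc.)?
Is there a better approach to this problem? Can it be done with a single List.map?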

Answer

While it is not a single List.map, here is the solution I came up with:

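let parseText allLines =
    allLines
    |> Seq.fold (fun (currentTitle, paragraphs) line ->
        match parseLine line with
        | Empty -> currentTitle, paragraphs
        | Title caption -> caption, paragraphs
        // keep the current title so later body lines under it still get it
        | Body text -> currentTitle, Paragraph(currentTitle, text) :: paragraphs
        ) (String.Empty, [])
    |> snd
    |> List.rev // paragraphs were prepended, so restore input order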

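I use a tuple (currentTitle, paragraphs) as the state. snd extracts the result (the second element of the state tuple).

When you do most of your processing in F#, using lists is tempting, but other data structures, and even plain sequences, have their uses.

By the way, does your sequence code compile? I had to replace mutable currentTitle = String.Empty with currentTitle = ref String.Empty.

Now this is very nice! –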

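Answer

You can replace 0 |> ignore with () (unit), which is a no-op. The biggest difference between your two implementations is that the first one is lazy, which can be useful for large inputs.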
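As a rough sketch of that suggestion, the sequence version might look like this with () in the Empty case. The ParsedLine type, the Paragraph record with its Heading and Text fields, and the parseLine rule (a leading "- " marks a title) are all assumptions made up for illustration, since the question does not show them. A ref cell is used in place of the mutable local, and it is created inside the seq block so that each enumeration starts with fresh state.

```fsharp
open System

// Hypothetical stand-ins for the types the question does not show.
type ParsedLine =
    | Empty
    | Title of string
    | Body of string

type Paragraph = { Heading: string; Text: string }

// Assumed rule: blank lines are Empty, lines starting with "- " are titles.
let parseLine (line: string) =
    let trimmed = line.Trim()
    if trimmed = "" then Empty
    elif trimmed.StartsWith("- ") then Title (trimmed.Substring 2)
    else Body trimmed

let parseText (allLines: seq<string>) =
    seq {
        // State is local to the sequence, so every enumeration starts fresh.
        let currentTitle = ref String.Empty
        for line in allLines do
            match parseLine line with
            | Empty -> ()                                    // no-op: nothing is yielded
            | Title caption -> currentTitle.Value <- caption // remember the title
            | Body text -> yield { Heading = currentTitle.Value; Text = text }
    }
```

Since the result is a seq, no line is parsed until the sequence is enumerated, for example with List.ofSeq.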

The following might also work for you (it is the simplest solution I can think of):

let parseText (lines: seq<string>) =
    lines
    |> Seq.filter (fun line -> line.Trim().Length > 0)
    |> Seq.pairwise // Seq.pairwise takes no function; map over the resulting pairs
    |> Seq.map (fun (title, body) -> Paragraph(title, body))

If not, maybe this will work:

let parseText (lines: seq<string>) =
    lines
    |> Seq.choose (fun line ->
        // Title and Body are assumed here to be active patterns over strings
        match line.Trim() with
        | "" | null -> None
        | Title title -> Some title
        | Body text -> Some text)
    |> Seq.pairwise
    |> Seq.map (fun (title, body) -> Paragraph(title, body))

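I don't think Seq.pairwise will do, because I can have n lines of text before the next title. Sorry if I didn't make that clear in the question. –

+1 for unit instead of 0 |> ignore. Thanks! –

@Francesco: In that case, I think your mutable solution is as good as it gets (for sequences). A functional solution would be longer and less readable. – Daniel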
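To illustrate the Seq.pairwise concern raised in the comments, here is a small sketch with made-up input lines. It only shows what Seq.pairwise produces on raw strings and is not a proposed solution.

```fsharp
// Seq.pairwise yields each element together with its successor,
// so a body line is paired with the line before it, not with the current title.
let lines = [ "- Chapter 1"; "Lorem ipsum"; "consectetur"; "- Chapter 2"; "sed do" ]

let pairs = lines |> Seq.pairwise |> List.ofSeq
// pairs = [ ("- Chapter 1", "Lorem ipsum"); ("Lorem ipsum", "consectetur");
//           ("consectetur", "- Chapter 2"); ("- Chapter 2", "sed do") ]
```

Only the first and last pairs match a title with a body line; the second body line of Chapter 1 ends up paired with another body line, which is why some running state is needed.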

Answer

Here is one such implementation (untested, but I hope it gives you the idea):

let isNotEmpty l =
    match l with
    | Empty -> false
    | _ -> true

let parseText allLines =
    allLines
    |> Seq.map parseLine
    |> Seq.filter isNotEmpty
    |> Seq.scan (fun (c, t, b) i ->
        match i with
        | Title tl -> (0, tl, "")
        | Body bb -> (1, t, bb)
        | _ -> (0, t, b)) (0, "", "")
    |> Seq.filter (fun (c, _, _) -> c > 0)
    |> Seq.map (fun (_, t, b) -> Paragraph(t, b))

Nice! I find this version less readable, but it is still interesting for learning. –
