The source CSV file is:

123456TextValue1
654321TextValue2

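Here 123456 and TextValue1 are two separate values, delimited by a binary control character (\u0001) that is not visible when the file is displayed.

Similarly, 654321 and TextValue2 are separate values on the second line.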
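Because \u0001 is a non-printing character, each sample line looks like a single token. A minimal Python sketch, separate from the NiFi flow and using illustrative names (DELIMITER, lines), makes the delimiter visible and shows the two values each line is meant to contain:

```python
# Illustrative only: rebuild the two sample lines with the U+0001 delimiter.
DELIMITER = "\u0001"  # Start of Heading (SOH) control character

lines = [
    "123456" + DELIMITER + "TextValue1",
    "654321" + DELIMITER + "TextValue2",
]

for line in lines:
    # repr() escapes the control character so it shows up on screen.
    print(repr(line), "->", line.split(DELIMITER))
```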

I use ConvertRecord to change the delimiter from "\u0001" to ";".

The Record Reader is a CSVReader with the following properties:

  • Schema Access Strategy: Use 'Schema Text' Property
  • Schema Text: #{text_schema}
  • Value Separator: \u0001
  • Treat First Line as Header: false
  • Ignore CSV Header Column Names: true

The Record Writer is a CSVRecordSetWriter with the following properties:

  • Schema Access Strategy: Use 'Schema Text' Property
  • Schema Text: #{text_schema}
  • Value Separator: ;
  • Include Header Line: true
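For comparison, the conversion this reader/writer pair is meant to perform can be sketched outside NiFi with Python's standard csv module. This is only an illustration of the intended result, not of how ConvertRecord works internally; the convert function and its parameters are hypothetical names chosen for this example.

```python
import csv
import io


def convert(text, in_sep="\u0001", out_sep=";", header=("FIELD_1", "FIELD_2")):
    """Re-delimit text from in_sep to out_sep and prepend a header line."""
    reader = csv.reader(io.StringIO(text, newline=""), delimiter=in_sep)
    out = io.StringIO(newline="")
    writer = csv.writer(out, delimiter=out_sep, lineterminator="\n")
    writer.writerow(header)
    for row in reader:
        if not row:  # skip blank lines
            continue
        number, label = row
        # Mirror the schema: FIELD_1 is an int (or null), FIELD_2 a string.
        writer.writerow([int(number) if number else "", label])
    return out.getvalue()


sample = "123456\u0001TextValue1\n654321\u0001TextValue2\n"
print(convert(sample))
```

The integer conversion succeeds here only because the reader splits each line on the control character first, which is the step the NiFi reader has to perform as well.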

The text_schema parameter contains:

{
  "type": "record",
  "name": "test_schema",
  "fields": [
    {
      "name": "FIELD_1",
      "type": ["int", "null"],
      "description": "FIELD_1"
    },
    {
      "name": "FIELD_2",
      "type": ["string", "null"],
      "description": "FIELD_2"
    }
  ]
}

Expected output is:

   FIELD_1;FIELD_2;
   123456;TextValue1
   654321;TextValue2

But I got the following error:

ERROR ConvertRecord[id=01931001-0d7e-1e43-146d-1a380e6d43b7] Failed to process FlowFile[filename=7365b509-7100-4bc2-a070-4cc8ce8377b9]; will route to failure: org.apache.nifi.processor.exception.ProcessException: Could not parse incoming data
Caused by: org.apache.nifi.serialization.MalformedRecordException: Error while getting next record
Caused by: java.lang.NumberFormatException: For input string: "123456TextValue1"
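
The NumberFormatException reports the whole line as its input string, which suggests the reader never split the line on the separator before trying to parse FIELD_1 as an int. One possible cause, which is an assumption and not confirmed here, is that the Value Separator text was taken as the six literal characters \u0001 instead of the control character. The Python sketch below reproduces that kind of failure; parse_record is a hypothetical helper, not NiFi code.

```python
def parse_record(line, separator):
    """Split one line and coerce the first field to int, like FIELD_1."""
    fields = line.split(separator)
    return int(fields[0]), fields[1:]


line = "123456\u0001TextValue1"

# Separator is the real control character: the line splits into two fields.
print(parse_record(line, "\u0001"))

# Separator is the six literal characters backslash-u-0-0-0-1 (assumed cause):
# nothing matches, the whole line stays in one field, and int() fails.
try:
    parse_record(line, "\\u0001")
except ValueError as err:
    print("ValueError:", err)
```

If this is what happens in the flow, checking how the configured separator is actually interpreted would be a reasonable first step.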

Tags: apache nifi, How to read csv with binary delimiter, Stack Overflow