Csv codec plugin

  • Plugin version: v1.1.0
  • Released on: 2021-07-28
  • Changelog

For other versions, see the Versioned plugin docs.

Installation

For plugins not bundled by default, install the plugin by running bin/logstash-plugin install logstash-codec-csv. See Working with plugins for more details.

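Getting help

For questions about the plugin, open a topic in the Discuss forums. For bugs or feature requests, open an issue in Github. For the list of Elastic supported plugins, consult the Elastic Support Matrix.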

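Description

The csv codec takes CSV data, parses it, and passes it along.

Compatibility with the Elastic Common Schema (ECS)

The plugin behaves the same regardless of ECS compatibility, except that it logs a warning when ECS is enabled and target is not set.

Set the target option to avoid potential schema conflicts.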

Csv codec configuration options

Setting                      Input type                                          Required
autodetect_column_names      boolean                                             No
autogenerate_column_names    boolean                                             No
charset                      string, one of the encodings listed under charset   No
columns                      array                                               No
convert                      hash                                                No
ecs_compatibility            string                                              No
include_headers              boolean                                             No
quote_char                   string                                              No
separator                    string                                              No
skip_empty_columns           boolean                                             No
target                       string                                              No

autodetect_column_names

Defines whether column names should be auto-detected from the header row. Defaults to false.
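
As an illustrative sketch, the following pipeline fragment enables header detection on a file input. The file path is a hypothetical placeholder, and the sketch assumes that the first line of the file holds the column names.

```
input {
  file {
    # Hypothetical path; replace with the location of your own CSV file.
    path => "/tmp/example.csv"
    codec => csv {
      # Take column names from the header line instead of listing them in `columns`.
      autodetect_column_names => true
    }
  }
}
```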

autogenerate_column_names

Defines whether column names should be autogenerated. Defaults to true. If set to false, columns that have no header specified are not parsed.
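
A minimal sketch of turning autogeneration off, assuming a stdin input and two hypothetical column names (id and name):

```
input {
  stdin {
    codec => csv {
      # Only these two columns have names (hypothetical names for illustration).
      columns => ["id", "name"]
      # Do not invent names such as "column3" for any further values.
      autogenerate_column_names => false
    }
  }
}
```

With this configuration, a line such as 1,alice,extra would be expected to yield only the id and name fields, because the third value has no configured name and, per the description above, is not parsed.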

charset

  • Value can be any of: ASCII-8BIT, UTF-8, US-ASCII, Big5, Big5-HKSCS, Big5-UAO, CP949, Emacs-Mule, EUC-JP, EUC-KR, EUC-TW, GB2312, GB18030, GBK, ISO-8859-1, ISO-8859-2, ISO-8859-3, ISO-8859-4, ISO-8859-5, ISO-8859-6, ISO-8859-7, ISO-8859-8, ISO-8859-9, ISO-8859-10, ISO-8859-11, ISO-8859-13, ISO-8859-14, ISO-8859-15, ISO-8859-16, KOI8-R, KOI8-U, Shift_JIS, UTF-16BE, UTF-16LE, UTF-32BE, UTF-32LE, Windows-31J, Windows-1250, Windows-1251, Windows-1252, IBM437, IBM737, IBM775, CP850, IBM852, CP852, IBM855, CP855, IBM857, IBM860, IBM861, IBM862, IBM863, IBM864, IBM865, IBM866, IBM869, Windows-1258, GB1988, macCentEuro, macCroatian, macCyrillic, macGreek, macIceland, macRoman, macRomania, macThai, macTurkish, macUkraine, CP950, CP951, IBM037, stateless-ISO-2022-JP, eucJP-ms, CP51932, EUC-JIS-2004, GB12345, ISO-2022-JP, ISO-2022-JP-2, CP50220, CP50221, Windows-1256, Windows-1253, Windows-1255, Windows-1254, TIS-620, Windows-874, Windows-1257, MacJapanese, UTF-7, UTF8-MAC, UTF-16, UTF-32, UTF8-DoCoMo, SJIS-DoCoMo, UTF8-KDDI, SJIS-KDDI, ISO-2022-JP-KDDI, stateless-ISO-2022-JP-KDDI, UTF8-SoftBank, SJIS-SoftBank, BINARY, CP437, CP737, CP775, IBM850, CP857, CP860, CP861, CP862, CP863, CP864, CP865, CP866, CP869, CP1258, Big5-HKSCS:2008, ebcdic-cp-us, eucJP, euc-jp-ms, EUC-JISX0213, eucKR, eucTW, EUC-CN, eucCN, CP936, ISO2022-JP, ISO2022-JP2, ISO8859-1, ISO8859-2, ISO8859-3, ISO8859-4, ISO8859-5, ISO8859-6, CP1256, ISO8859-7, CP1253, ISO8859-8, CP1255, ISO8859-9, CP1254, ISO8859-10, ISO8859-11, CP874, ISO8859-13, CP1257, ISO8859-14, ISO8859-15, ISO8859-16, CP878, MacJapan, ASCII, ANSI_X3.4-1968, 646, CP65000, CP65001, UTF-8-MAC, UTF-8-HFS, UCS-2BE, UCS-4BE, UCS-4LE, CP932, csWindows31J, SJIS, PCK, CP1250, CP1251, CP1252, external, locale
  • Default value is "UTF-8"

The character encoding used in this codec. Examples include "UTF-8" and "CP1252".
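
For instance, CSV data exported in a Windows code page could be read with a fragment along these lines. This is a sketch only; the stdin input is an assumption made for illustration.

```
input {
  stdin {
    codec => csv {
      # Declare the encoding of the incoming data; "CP1252" is one of the listed values.
      charset => "CP1252"
    }
  }
}
```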

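columns

  • Value type is array
  • Default value is []

When decoding: defines a list of column names, in the order they appear in the CSV, as if it were a header line. If columns is not configured, or not enough columns are specified, the default column names are "column1", "column2", and so on.

When encoding: the list of field names to include in the encoded CSV, in the order listed.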
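
The decoding case can be sketched as follows, assuming a stdin input and two hypothetical column names:

```
input {
  stdin {
    codec => csv {
      # Names are applied to values in order: first value -> "name", second -> "age".
      columns => ["name", "age"]
    }
  }
}
```

Given a line such as alice,30,paris, the first two values would be expected to land in name and age, while the third, which has no configured name, would fall back to the default naming described above ("column3").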

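convert

  • Value type is hash
  • Default value is {}

Defines a set of datatype conversions to be applied to columns. Possible conversions are: integer, float, date, date_time, boolean.

Example: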

    filter {
      csv {
        convert => { "column1" => "integer", "column2" => "boolean" }
      }
    }
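
The example above is written as a csv filter block. Because this page documents the codec, a sketch of the same setting attached to an input may be helpful. The stdin input and the column names id and active are assumptions chosen for illustration.

```
input {
  stdin {
    codec => csv {
      # Hypothetical column names for illustration.
      columns => ["id", "active"]
      # Convert "id" to an integer and "active" to a boolean after parsing.
      convert => { "id" => "integer", "active" => "boolean" }
    }
  }
}
```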

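ecs_compatibility

  • Value type is string
  • Supported values are:

    • disabled: CSV data is added at the root level
    • v1, v8: Elastic Common Schema compliant behavior ([event][original] is also added)
  • Default value depends on which version of Logstash is running:

    • When Logstash provides a pipeline.ecs_compatibility setting, its value is used as the default
    • Otherwise, the default value is disabled

Controls this plugin's compatibility with the Elastic Common Schema (ECS).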
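
A minimal sketch of enabling ECS-compatible behavior explicitly, assuming a stdin input. It also sets target, since the plugin warns when ECS is enabled and target is not set; the field name document is only an illustration.

```
input {
  stdin {
    codec => csv {
      # Request ECS-compliant behavior instead of relying on the pipeline default.
      ecs_compatibility => "v8"
      # Place the parsed columns under a dedicated field to avoid schema conflicts.
      target => "[document]"
    }
  }
}
```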

include_headers

When encoding in an output plugin, include headers in the encoded CSV once per codec lifecycle (not for every event). Defaults to false.
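
On the encoding side, a sketch might look like the following. The stdout output and the field names are assumptions made for illustration.

```
output {
  stdout {
    codec => csv {
      # Fields to write, in this order (hypothetical names).
      columns => ["name", "age"]
      # Emit a header line once, before the first encoded event.
      include_headers => true
    }
  }
}
```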

quote_char

Defines the character used to quote CSV fields. If this is not specified, the default is a double quote ("). Optional.

separator

Defines the column separator value. If this is not specified, the default is a comma (,). Optional.
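
As a sketch, semicolon-separated data whose fields are wrapped in single quotes could be handled by setting both separator and quote_char. The stdin input is an assumption made for illustration.

```
input {
  stdin {
    codec => csv {
      # Columns are split on semicolons instead of commas.
      separator => ";"
      # Fields are quoted with single quotes instead of double quotes.
      quote_char => "'"
    }
  }
}
```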

skip_empty_columns

Defines whether empty columns should be skipped. Defaults to false. If set to true, columns containing no value are not included.
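
A short sketch, assuming a stdin input and three hypothetical column names:

```
input {
  stdin {
    codec => csv {
      # Hypothetical column names for illustration.
      columns => ["a", "b", "c"]
      # Leave out any column whose value is empty.
      skip_empty_columns => true
    }
  }
}
```

For a line such as 1,,3, this configuration would be expected to produce only the a and c fields, since the second column contains no value.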

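target

  • Value type is string
  • There is no default value for this setting.

Defines the target field in which to place the row values. If this setting is not set, the CSV data is stored at the root (top level) of the event.

For example, if you want the data to be placed under the document field: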

    input {
      file {
        codec => csv {
          autodetect_column_names => true
          target => "[document]"
        }
      }
    }