Hive SerDe for CSV
Ken Williams 478a6b1941 Tell the user what the default quote/escape/sep characters are
Since there are various versions of the CSV "standard" floating around - most notably how embedded quotes are escaped.
2014-02-14 17:47:46 -06:00
src move hive/hadoop deps to provided 2013-10-30 22:56:07 -07:00
.gitignore remove .classpath/.project 2013-10-30 12:32:12 -07:00
LICENSE.txt apache 2 license 2011-11-16 11:59:46 -08:00
pom.xml move hive/hadoop deps to provided 2013-10-30 22:56:07 -07:00
readme.md Tell the user what the default quote/escape/sep characters are 2014-02-14 17:47:46 -06:00

readme.md

Hive CSV Support


This SerDe adds real CSV input and output support to Hive using the excellent opencsv library.

Using

Basic Use

add jar path/to/csv-serde.jar;

create table my_table(a string, b string, ...)
  row format serde 'com.bizo.hive.serde.csv.CSVSerde'
  stored as textfile
;
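
Once the table exists, it can be loaded and queried like any other text-backed table. The following is a minimal sketch, not part of the original instructions: the file path `/tmp/my_table.csv` is hypothetical, and the cast assumes column `b` happens to hold integer values.

```sql
-- Hypothetical path; point this at your own CSV file.
load data local inpath '/tmp/my_table.csv' overwrite into table my_table;

-- Columns were declared as string above, so convert at query time
-- (assumes b contains integers).
select a, cast(b as int) as b_int
from my_table
limit 10;
```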

Custom formatting

The default separator, quote, and escape characters from the opencsv library are:

DEFAULT_ESCAPE_CHARACTER \
DEFAULT_QUOTE_CHARACTER  "
DEFAULT_SEPARATOR        ,

You can also specify custom separator, quote, or escape characters.

add jar path/to/csv-serde.jar;

create table my_table(a string, b string, ...)
 row format serde 'com.bizo.hive.serde.csv.CSVSerde'
 with serdeproperties (
   "separatorChar" = "\t",
   "quoteChar"     = "'",
   "escapeChar"    = "\\"
 )
 stored as textfile
;
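
To check which properties a table ended up with, or to change them afterwards, standard Hive statements should be enough. This is a sketch under the assumption that `my_table` was created as above; the pipe separator is only an illustrative value.

```sql
-- The SerDe parameters appear in the storage section of the output.
describe formatted my_table;

-- Switch an existing table to a different separator (illustrative value).
alter table my_table set serdeproperties ("separatorChar" = "|");
```

Changing the properties only affects how the files are parsed when read; it does not rewrite data already stored in the table.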

Files

The following include opencsv along with the SerDe, so only the single jar is needed. They are currently built against Hive 0.11.0 but should be compatible with other Hive versions.

Building

Run mvn package to build. This produces both a basic artifact and a "fat jar" that bundles opencsv.

Eclipse support

Run mvn eclipse:eclipse to generate .project and .classpath files for Eclipse.

License

csv-serde is open source and licensed under the Apache 2 License.