| draft-shafranovich-mime-csv-02.txt | draft-shafranovich-mime-csv-03.txt | |||
|---|---|---|---|---|
| Network Working Group Y. Shafranovich | Network Working Group Y. Shafranovich | |||
| Internet-Draft SolidMatrix Technologies, Inc. | Internet-Draft SolidMatrix Technologies, Inc. | |||
| Expires: August 22, 2005 February 18, 2005 | Expires: September 24, 2005 March 23, 2005 | |||
| Common Format and MIME Type for CSV Files | Common Format and MIME Type for CSV Files | |||
| draft-shafranovich-mime-csv-02.txt | draft-shafranovich-mime-csv-03.txt | |||
| Status of this Memo | Status of this Memo | |||
| This document is an Internet-Draft and is subject to all provisions | This document is an Internet-Draft and is subject to all provisions | |||
| of Section 3 of RFC 3667. By submitting this Internet-Draft, each | of Section 3 of RFC 3667. By submitting this Internet-Draft, each | |||
| author represents that any applicable patent or other IPR claims of | author represents that any applicable patent or other IPR claims of | |||
| which he or she is aware have been or will be disclosed, and any of | which he or she is aware have been or will be disclosed, and any of | |||
| which he or she become aware will be disclosed, in accordance with | which he or she become aware will be disclosed, in accordance with | |||
| RFC 3668. | RFC 3668. | |||
| skipping to change at page 1, line 35 | skipping to change at page 1, line 35 | |||
| and may be updated, replaced, or obsoleted by other documents at any | and may be updated, replaced, or obsoleted by other documents at any | |||
| time. It is inappropriate to use Internet-Drafts as reference | time. It is inappropriate to use Internet-Drafts as reference | |||
| material or to cite them other than as "work in progress." | material or to cite them other than as "work in progress." | |||
| The list of current Internet-Drafts can be accessed at | The list of current Internet-Drafts can be accessed at | |||
| http://www.ietf.org/ietf/1id-abstracts.txt. | http://www.ietf.org/ietf/1id-abstracts.txt. | |||
| The list of Internet-Draft Shadow Directories can be accessed at | The list of Internet-Draft Shadow Directories can be accessed at | |||
| http://www.ietf.org/shadow.html. | http://www.ietf.org/shadow.html. | |||
| This Internet-Draft will expire on August 22, 2005. | This Internet-Draft will expire on September 24, 2005. | |||
| Copyright Notice | Copyright Notice | |||
| Copyright (C) The Internet Society (2005). | Copyright (C) The Internet Society (2005). | |||
| Abstract | Abstract | |||
| This document documents the format used for Comma-Separated Values | This document documents the format used for Comma-Separated Values | |||
| (CSV) files and registers the associated MIME type "text/csv". | (CSV) files and registers the associated MIME type "text/csv". | |||
| skipping to change at page 3, line 18 | skipping to change at page 3, line 18 | |||
| and converting data between various spreadsheet programs for quite | and converting data between various spreadsheet programs for quite | |||
| some time. Surprisingly, while this format is very common it has | some time. Surprisingly, while this format is very common it has | |||
| never been formally documented. Additionally, while the IANA MIME | never been formally documented. Additionally, while the IANA MIME | |||
| registration tree includes a registration for | registration tree includes a registration for | |||
| "text/tab-separated-values" type, no MIME types have ever been | "text/tab-separated-values" type, no MIME types have ever been | |||
| registered with IANA for CSV. At the same time, various programs and | registered with IANA for CSV. At the same time, various programs and | |||
| operating systems have begun to use different MIME types for this | operating systems have begun to use different MIME types for this | |||
| format, many of which vary from system to system. This document | format, many of which vary from system to system. This document | |||
| seeks to document the format of comma separated values (CSV) files | seeks to document the format of comma separated values (CSV) files | |||
| and to formally register the "text/csv" MIME type for CSV in | and to formally register the "text/csv" MIME type for CSV in | |||
| accordance with RFC 2048 [4]. | accordance with RFC 2048 [1]. | |||
| 2. Definition of the CSV format | 2. Definition of the CSV format | |||
| While there are various specifications and implementations for the | While there are various specifications and implementations for the | |||
| CSV format (for ex. [5], [6], [7] and [8]), no formal specification | CSV format (for ex. [4], [5], [6] and [7]), no formal specification | |||
| exists which causes a wide variety of interpretations for CSV files. | exists which causes a wide variety of interpretations for CSV files. | |||
| This section seeks to document the format that seems to be followed | This section seeks to document the format that seems to be followed | |||
| by most implementations: | by most implementations: | |||
| 1. Each record is located on a separate line delimited by a line | 1. Each record is located on a separate line delimited by a line | |||
| break (CRLF). For example: | break (CRLF). For example: | |||
| aaa,bbb,ccc CRLF | aaa,bbb,ccc CRLF | |||
| zzz,yyy,xxx CRLF | zzz,yyy,xxx CRLF | |||
| skipping to change at page 3, line 51 | skipping to change at page 3, line 51 | |||
| of the file with the same format as normal record lines. This | of the file with the same format as normal record lines. This | |||
| header will contain names corresponding to the fields in the file | header will contain names corresponding to the fields in the file | |||
| and will usually contain the same number of fields as the records | and will usually contain the same number of fields as the records | |||
| in the rest of the file. For example: | in the rest of the file. For example: | |||
| field_name,field_name,field_name CRLF | field_name,field_name,field_name CRLF | |||
| aaa,bbb,ccc CRLF | aaa,bbb,ccc CRLF | |||
| zzz,yyy,xxx CRLF | zzz,yyy,xxx CRLF | |||
| 4. Within the header and each record there may be one or more | 4. Within the header and each record there may be one or more | |||
| fields, delimited by commas. The last field in the record may or | fields, separated by commas. The last field in the record may | |||
| may not be followed by a comma. For example: | not be followed by a comma. For example: | |||
| aaa,bbb,ccc | aaa,bbb,ccc | |||
| 5. Each field may or may not be enclosed in double quotes (however | 5. Each field may or may not be enclosed in double quotes (however | |||
| some programs such as Microsoft Excel do not use double quotes at | some programs such as Microsoft Excel do not use double quotes at | |||
| all). For example: | all). For example: | |||
| "aaa","bbb","ccc" CRLF | "aaa","bbb","ccc" CRLF | |||
| zzz,yyy,xxx | zzz,yyy,xxx | |||
| 6. Field containing line breaks (CRLF) and commas should be enclosed | 6. Field containing line breaks (CRLF) and commas should be enclosed | |||
| in double-quotes. For example: | in double-quotes. For example: | |||
| "aaa","b CRLF | "aaa","b CRLF | |||
| bb","ccc" CRLF | bb","ccc" CRLF | |||
| zzz,yyy,xxx | zzz,yyy,xxx | |||
| 7. If double-quotes are used to enclosed fields, then double-quotes | 7. If double-quotes are used to enclosed fields, then a double-quote | |||
| inside fields must be surrounded by double quotes. For example: | appearing inside a field must be escaped by preceding it with | |||
| another double quote. For example: | ||||
| "aaa","b"""bb","ccc" | "aaa","b""bb","ccc" | |||
| The ABNF grammar [1] appears as follows: | The ABNF grammar [2] appears as follows: | |||
| file = [header CRLF] record *(CRLF record) [CRLF] | file = [header CRLF] record *(CRLF record) [CRLF] | |||
| header = name *(COMMA name) | header = name *(COMMA name) | |||
| record = field *(COMMA field) | record = field *(COMMA field) | |||
| name = field | name = field | |||
| field = (escaped / non-escaped) | field = (escaped / non-escaped) | |||
| escaped = DQUOTE *(VCHAR / CR / LF / CRLF / 3*DQUOTE) DQUOTE | escaped = DQUOTE *(VCHAR / CR / LF / CRLF / 2*DQUOTE) DQUOTE | |||
| non-escaped = *VCHAR | non-escaped = *VCHAR | |||
| COMMA = %x2C | COMMA = %x2C | |||
| CR = %x0D ;as per section 6.1 of RFC 2234 [1] | CR = %x0D ;as per section 6.1 of RFC 2234 [2] | |||
| DQUOTE = %x22;as per section 6.1 of RFC 2234 [1] | DQUOTE = %x22;as per section 6.1 of RFC 2234 [2] | |||
| LF = %x0A ;as per section 6.1 of RFC 2234 [1] | LF = %x0A ;as per section 6.1 of RFC 2234 [2] | |||
| CRLF = CR LF ;as per section 6.1 of RFC 2234 [1] | CRLF = CR LF ;as per section 6.1 of RFC 2234 [2] | |||
| VCHAR = %x21-7E ;as per section 6.1 of RFC 2234 [1] | VCHAR = %x21-7E ;as per section 6.1 of RFC 2234 [2] | |||
| 3. MIME Type Registration of text/csv | 3. MIME Type Registration of text/csv | |||
| This section provides the media-type registration application (as per | This section provides the media-type registration application (as per | |||
| RFC 2048 [4], which will be submitted to IANA after IESG approval of | RFC 2048 [1], which will be submitted to IANA after IESG approval of | |||
| this document. | this document. | |||
| To: ietf-types@iana.org | To: ietf-types@iana.org | |||
| Subject: Registration of MIME media type text/csv | Subject: Registration of MIME media type text/csv | |||
| MIME media type name: text | MIME media type name: text | |||
| MIME subtype name: csv | MIME subtype name: csv | |||
| Required parameters: none | Required parameters: none | |||
| Optional parameters: charset | Optional parameters: charset | |||
| Common usage of CSV is US-ASCII, but other character sets as | Common usage of CSV is US-ASCII, but other character sets as | |||
| defined by IANA for the "text" tree may be used. | defined by IANA for the "text" tree may be used. | |||
| Encoding considerations: | Encoding considerations: | |||
| As per section 4.1.1. of RFC 2046 [2], this media type uses CRLF | As per section 4.1.1. of RFC 2046 [3], this media type uses CRLF | |||
| to denote line breaks. However, implementors should be aware that | to denote line breaks. However, implementors should be aware that | |||
| some implementations may use other values. | some implementations may use other values. | |||
| Security considerations: | Security considerations: | |||
| CSV files contain passive text data which should not pose any | CSV files contain passive text data which should not pose any | |||
| risks. However, it is possible in theory that malicious binary | risks. However, it is possible in theory that malicious binary | |||
| data maybe included in order to exploit potential buffer overruns | data maybe included in order to exploit potential buffer overruns | |||
| in the program processing CSV data. Additionally, private data | in the program processing CSV data. Additionally, private data | |||
| maybe shared via this format (which of course applies to any text | maybe shared via this format (which of course applies to any text | |||
| data). | data). | |||
| Interoperability considerations: | Interoperability considerations: | |||
| Due to lack of a single specification there are considerable | Due to lack of a single specification there are considerable | |||
| differences among different implementations. Implementors should | differences among different implementations. Implementors should | |||
| "be conservative in what you do, be liberal in what you accept | "be conservative in what you do, be liberal in what you accept | |||
| from others" (RFC 793 [3]) when processing CSV files. An attempt | from others" (RFC 793 [8]) when processing CSV files. An attempt | |||
| at a common definition can be found in Section 2. | at a common definition can be found in Section 2. | |||
| Published specification: | Published specification: | |||
| While numerous private specifications exist for various programs | While numerous private specifications exist for various programs | |||
| and systems, there is no single "master" specification for this | and systems, there is no single "master" specification for this | |||
| format. An attempt at a common definition can be found in | format. An attempt at a common definition can be found in | |||
| Section 2. | Section 2. | |||
| Applications which use this media type: | Applications which use this media type: | |||
| skipping to change at page 6, line 39 | skipping to change at page 6, line 39 | |||
| After IESG approval, IANA is expected to register the MIME type | After IESG approval, IANA is expected to register the MIME type | |||
| "text/csv" using the application provided in Section 3 of this | "text/csv" using the application provided in Section 3 of this | |||
| document. | document. | |||
| 5. Security Considerations | 5. Security Considerations | |||
| See discussion above | See discussion above | |||
| 6. Acknowledgments | 6. Acknowledgments | |||
| The author would like to thank Dave Crocker, Martin Duerst and Bruce | The author would like to thank Dave Crocker, Martin Duerst, Clyde | |||
| Lilly for their helpful suggestions. A special word of thanks to | Ingram, Graham Klyne, Bruce Lilly and Chris Lilley for their helpful | |||
| Dave for helping with the ABNF grammar. | suggestions. A special word of thanks to Dave for helping with the | |||
| ABNF grammar. | ||||
| 7. References | 7. References | |||
| 7.1 Normative References | 7.1 Normative References | |||
| [1] Crocker, D., Ed. and P. Overell, "Augmented BNF for Syntax | [1] Freed, N., Klensin, J. and J. Postel, "Multipurpose Internet | |||
| Mail Extensions (MIME) Part Four: Registration Procedures", | ||||
| BCP 13, RFC 2048, November 1996. | ||||
| [2] Crocker, D., Ed. and P. Overell, "Augmented BNF for Syntax | ||||
| Specifications: ABNF", RFC 2234, November 1997. | Specifications: ABNF", RFC 2234, November 1997. | |||
| [2] Freed, N. and N. Borenstein, "Multipurpose Internet Mail | [3] Freed, N. and N. Borenstein, "Multipurpose Internet Mail | |||
| Extensions (MIME) Part Two: Media Types", RFC 2046, November | Extensions (MIME) Part Two: Media Types", RFC 2046, November | |||
| 1996. | 1996. | |||
| [3] Postel, J., "Transmission Control Protocol", STD 7, RFC 793, | ||||
| September 1981. | ||||
| 7.2 Informative References | 7.2 Informative References | |||
| [4] Freed, N., Klensin, J. and J. Postel, "Multipurpose Internet | [4] Repici, J., "HOW-TO: The Comma Separated Value (CSV) File | |||
| Mail Extensions (MIME) Part Four: Registration Procedures", | ||||
| BCP 13, RFC 2048, November 1996. | ||||
| [5] Repici, J., "HOW-TO: The Comma Separated Value (CSV) File | ||||
| Format", 2004, | Format", 2004, | |||
| <http://www.creativyst.com/Doc/Articles/CSV/CSV01.htm>. | <http://www.creativyst.com/Doc/Articles/CSV/CSV01.htm>. | |||
| [6] Edoceo, Inc., "CSV Standard File Format", 2004, | [5] Edoceo, Inc., "CSV Standard File Format", 2004, | |||
| <http://www.edoceo.com/utilis/csv-file-format.php>. | <http://www.edoceo.com/utilis/csv-file-format.php>. | |||
| [7] Rodger, R. and O. Shanaghy, "Documentation for Ricebridge CSV | [6] Rodger, R. and O. Shanaghy, "Documentation for Ricebridge CSV | |||
| Manager", February 2005, | Manager", February 2005, | |||
| <http://www.ricebridge.com/products/csvman/reference.htm>. | <http://www.ricebridge.com/products/csvman/reference.htm>. | |||
| [8] Raymond, E., "The Art of Unix Programming, Chapter 5", September | [7] Raymond, E., "The Art of Unix Programming, Chapter 5", September | |||
| 2003, | 2003, | |||
| <http://www.catb.org/~esr/writings/taoup/html/ch05s02.html>. | <http://www.catb.org/~esr/writings/taoup/html/ch05s02.html>. | |||
| [8] Postel, J., "Transmission Control Protocol", STD 7, RFC 793, | ||||
| September 1981. | ||||
| Author's Address | Author's Address | |||
| Yakov Shafranovich | Yakov Shafranovich | |||
| SolidMatrix Technologies, Inc. | SolidMatrix Technologies, Inc. | |||
| Email: ietf@shaftek.org | Email: ietf@shaftek.org | |||
| URI: http://www.shaftek.org | URI: http://www.shaftek.org | |||
| Appendix A. Status of This Document [To Be Removed Upon Publication] | Appendix A. Status of This Document [To Be Removed Upon Publication] | |||
| skipping to change at page 8, line 7 | skipping to change at page 8, line 8 | |||
| which is also reachable via <ietf-types@iana.org>. Of course, | which is also reachable via <ietf-types@iana.org>. Of course, | |||
| comments directly to the author are always welcome. | comments directly to the author are always welcome. | |||
| A.2 Document Repository | A.2 Document Repository | |||
| Copies of this and earlier versions including multiple formats can be | Copies of this and earlier versions including multiple formats can be | |||
| found at <http://www.shaftek.org/publications/drafts/mime-csv/>. | found at <http://www.shaftek.org/publications/drafts/mime-csv/>. | |||
| A.3 Document History | A.3 Document History | |||
| Changes from draft-shafranovich-mime-csv-02 to | ||||
| draft-shafranovich-mime-csv-03: | ||||
| o Changed text to prohibit the last field ending with a comma | ||||
| matching the ABNF grammar | ||||
| o The double quote escaping is now set to two double quotes instead | ||||
| of three | ||||
| o Moved some of the references between informative and normative | ||||
| sections | ||||
| Changes from draft-shafranovich-mime-csv-01 to | Changes from draft-shafranovich-mime-csv-01 to | |||
| draft-shafranovich-mime-csv-00: | draft-shafranovich-mime-csv-02: | |||
| o Minor errors in ABNF grammar corrected in response to AD comments | o Minor errors in ABNF grammar corrected in response to AD comments | |||
| o Minor spelling mistakes corrected | o Minor spelling mistakes corrected | |||
| Changes from draft-shafranovich-mime-csv-00 to | Changes from draft-shafranovich-mime-csv-00 to | |||
| draft-shafranovich-mime-csv-01: | draft-shafranovich-mime-csv-01: | |||
| o Type "text/comma-separated-values" has been removed | o Type "text/comma-separated-values" has been removed | |||
| o The "encoding consideration" paragraph of Section 3 has been | o The "encoding consideration" paragraph of Section 3 has been | |||
| changed to allow CRLF only as per section 4.1.1. of RFC 2046 [2]. | changed to allow CRLF only as per section 4.1.1. of RFC 2046 [3]. | |||
| This has been reflected in the ABNF grammar in Section 2. | This has been reflected in the ABNF grammar in Section 2. | |||
| o ABNF grammar in Section 2 has been cleaned up. | o ABNF grammar in Section 2 has been cleaned up. | |||
| o Acknowledgements and status sections were added. | o Acknowledgements and status sections were added. | |||
| o CSV format definition was moved to the normative section of the | o CSV format definition was moved to the normative section of the | |||
| document | document | |||
| Intellectual Property Statement | Intellectual Property Statement | |||
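
The quoting rules changed in the -03 column (a double quote inside a quoted field is escaped by doubling it, and quoted fields may contain commas and CRLF) can be exercised with a short script. The following is only a sketch: it assumes Python's standard `csv` module as a stand-in for an implementation of the grammar, which the draft itself does not reference, and the helper names `parse_csv` and `format_csv` are hypothetical.

```python
import csv
import io


def parse_csv(text):
    """Parse CSV text into a list of records, each a list of field strings."""
    # newline="" stops the stream from translating line endings, so a CRLF
    # inside a quoted field reaches the parser unchanged.
    stream = io.StringIO(text, newline="")
    # doublequote=True is the module default; it is spelled out here because
    # it corresponds to the doubled-quote escaping described in rule 7 (-03).
    return list(csv.reader(stream, delimiter=",", quotechar='"', doublequote=True))


def format_csv(records):
    """Serialize records with CRLF line breaks and doubled-quote escaping."""
    stream = io.StringIO(newline="")
    writer = csv.writer(
        stream,
        delimiter=",",
        quotechar='"',
        doublequote=True,
        lineterminator="\r\n",
        quoting=csv.QUOTE_MINIMAL,  # quote only fields that need it
    )
    writer.writerows(records)
    return stream.getvalue()


# Sample input built from the draft's own examples: an escaped double quote,
# a field containing a line break, and an unquoted final record without a
# trailing CRLF.
sample = '"aaa","b""bb","ccc"\r\n"aaa","b\r\nbb","ccc"\r\nzzz,yyy,xxx'
records = parse_csv(sample)
for record in records:
    print(record)
```

The `csv` module is more liberal than the ABNF in some respects (for instance, it accepts bare LF line endings), so this shows the escaping behaviour rather than strict conformance to the grammar.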