| TOC |
|
By submitting this Internet-Draft, each author represents that any applicable patent or other IPR claims of which he or she is aware have been or will be disclosed, and any of which he or she becomes aware will be disclosed, in accordance with Section 6 of BCP 79.
Internet-Drafts are working documents of the Internet Engineering Task Force (IETF), its areas, and its working groups. Note that other groups may also distribute working documents as Internet-Drafts.
Internet-Drafts are draft documents valid for a maximum of six months and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use Internet-Drafts as reference material or to cite them other than as “work in progress.”
The list of current Internet-Drafts can be accessed at http://www.ietf.org/ietf/1id-abstracts.txt.
The list of Internet-Draft Shadow Directories can be accessed at http://www.ietf.org/shadow.html.
This Internet-Draft will expire on October 7, 2005.
Copyright © The Internet Society (2005).
This document documents the format used for Comma-Separated Values (CSV) files and registers the associated MIME type "text/csv".
1.
Introduction
2.
Definition of the CSV format
3.
MIME Type Registration of text/csv
4.
IANA Considerations
5.
Security Considerations
6.
Acknowledgments
7.
References
7.1
Normative References
7.2
Informative References
§
Author's Address
A.
Status of This Document [To Be Removed Upon Publication]
A.1
Discussion Venue
A.2
Document Repository
A.3
Document History
§
Intellectual Property and Copyright Statements
| TOC |
The comma separated values format (CSV) has been used for exchanging and converting data between various spreadsheet programs for quite some time. Surprisingly, while this format is very common it has never been formally documented. Additionally, while the IANA MIME registration tree includes a registration for "text/tab-separated-values" type, no MIME types have ever been registered with IANA for CSV. At the same time, various programs and operating systems have begun to use different MIME types for this format, many of which vary from system to system. This document seeks to document the format of comma separated values (CSV) files and to formally register the "text/csv" MIME type for CSV in accordance with RFC 2048 (Freed, N., Klensin, J., and J. Postel, “Multipurpose Internet Mail Extensions (MIME) Part Four: Registration Procedures,” November 1996.)[1].
| TOC |
While there are various specifications and implementations for the CSV format (for ex. [4] (Repici, J., “HOW-TO: The Comma Separated Value (CSV) File Format,” 2004.), [5] (Edoceo, Inc., “CSV Standard File Format,” 2004.), [6] (Rodger, R. and O. Shanaghy, “Documentation for Ricebridge CSV Manager,” February 2005.) and [7] (Raymond, E., “The Art of Unix Programming, Chapter 5,” September 2003.)), no formal specification exists which causes a wide variety of interpretations for CSV files. This section seeks to document the format that seems to be followed by most implementations:
The ABNF grammar (Crocker, D., Ed. and P. Overell, “Augmented BNF for Syntax Specifications: ABNF,” November 1997.)[2] appears as follows:
file = [header CRLF] record *(CRLF record) [CRLF]
header = name *(COMMA name)
record = field *(COMMA field)
name = field
field = (escaped / non-escaped)
escaped = DQUOTE *(TEXTDATA / COMMA / CR / LF / 2DQUOTE) DQUOTE
non-escaped = *TEXTDATA
COMMA = %x2C
CR = %x0D ;as per section 6.1 of RFC 2234 (Crocker, D., Ed. and P. Overell, “Augmented BNF for Syntax Specifications: ABNF,” November 1997.)[2]
DQUOTE = %x22 ;as per section 6.1 of RFC 2234 (Crocker, D., Ed. and P. Overell, “Augmented BNF for Syntax Specifications: ABNF,” November 1997.)[2]
LF = %x0A ;as per section 6.1 of RFC 2234 (Crocker, D., Ed. and P. Overell, “Augmented BNF for Syntax Specifications: ABNF,” November 1997.)[2]
CRLF = CR LF ;as per section 6.1 of RFC 2234 (Crocker, D., Ed. and P. Overell, “Augmented BNF for Syntax Specifications: ABNF,” November 1997.)[2]
TEXTDATA = %x20-21 / %x23-2B / %x2D-7E
| TOC |
This section provides the media-type registration application (as per RFC 2048 (Freed, N., Klensin, J., and J. Postel, “Multipurpose Internet Mail Extensions (MIME) Part Four: Registration Procedures,” November 1996.)[1], which will be submitted to IANA after IESG approval of this document.
To: ietf-types@iana.org
Subject: Registration of MIME media type text/csv
MIME media type name: text
MIME subtype name: csv
Required parameters: none
Optional parameters: charset, header
Common usage of CSV is US-ASCII, but other character sets as defined by IANA for the "text" tree may be used in conjuction with the "charset" parameter.
The "header" parameter indicates the presence or absence of the header line. Valid values are "present" or "absent". Implementators choosing not to use this parameter must make their own decisions as to whether the header line is present or absent.
Encoding considerations:
As per section 4.1.1. of RFC 2046 (Freed, N. and N. Borenstein, “Multipurpose Internet Mail Extensions (MIME) Part Two: Media Types,” November 1996.)[3], this media type uses CRLF to denote line breaks. However, implementors should be aware that some implementations may use other values.
Security considerations:
CSV files contain passive text data which should not pose any risks. However, it is possible in theory that malicious binary data maybe included in order to exploit potential buffer overruns in the program processing CSV data. Additionally, private data maybe shared via this format (which of course applies to any text data).
Interoperability considerations:
Due to lack of a single specification there are considerable differences among different implementations. Implementors should "be conservative in what you do, be liberal in what you accept from others" (RFC 793 (Postel, J., “Transmission Control Protocol,” September 1981.)[8]) when processing CSV files. An attempt at a common definition can be found in Section 2 (Definition of the CSV format).
Implementations deciding not to use the optional "header" parameter must make their own decision as to whether the header is absent or present.
Published specification:
While numerous private specifications exist for various programs and systems, there is no single "master" specification for this format. An attempt at a common definition can be found in Section 2 (Definition of the CSV format).
Applications which use this media type:
Spreadsheet programs and various data conversion utilities
Additional information:
Magic number(s): none
File extension(s): CSV
Macintosh File Type Code(s): TEXT
Person & email address to contact for further information:
Yakov Shafranovich <ietf@shaftek.org>
Intended usage: COMMON
Author/Change controller: IESG
| TOC |
After IESG approval, IANA is expected to register the MIME type "text/csv" using the application provided in Section 3 (MIME Type Registration of text/csv) of this document.
| TOC |
See discussion above
| TOC |
The author would like to thank Dave Crocker, Martin Duerst, Joel M. Halpern, Clyde Ingram, Graham Klyne, Bruce Lilly, Chris Lilley and members of the IESG for their helpful suggestions. A special word of thanks to Dave for helping with the ABNF grammar.
The author would also like to thank Henrik Lefkowetz, Marshall Rose and the folks at xml.resource.org for providing many of the tools used for preparing RFCs and Internet drafts.
| TOC |
| TOC |
| [1] | Freed, N., Klensin, J., and J. Postel, “Multipurpose Internet Mail Extensions (MIME) Part Four: Registration Procedures,” BCP 13, RFC 2048, November 1996 (TXT, HTML, XML). |
| [2] | Crocker, D., Ed. and P. Overell, “Augmented BNF for Syntax Specifications: ABNF,” RFC 2234, November 1997 (TXT, HTML, XML). |
| [3] | Freed, N. and N. Borenstein, “Multipurpose Internet Mail Extensions (MIME) Part Two: Media Types,” RFC 2046, November 1996. |
| TOC |
| [4] | Repici, J., “HOW-TO: The Comma Separated Value (CSV) File Format,” 2004 (HTML). |
| [5] | Edoceo, Inc., “CSV Standard File Format,” 2004 (HTML). |
| [6] | Rodger, R. and O. Shanaghy, “Documentation for Ricebridge CSV Manager,” February 2005 (HTML). |
| [7] | Raymond, E., “The Art of Unix Programming, Chapter 5,” September 2003 (HTML). |
| [8] | Postel, J., “Transmission Control Protocol,” STD 7, RFC 793, September 1981. |
| TOC |
| Yakov Shafranovich | |
| SolidMatrix Technologies, Inc. | |
| Email: | ietf@shaftek.org |
| URI: | http://www.shaftek.org |
| TOC |
Discussion about this document should be directed to the IETF-TYPES mailing list http://www.alvestrand.no/mailman/listinfo/ietf-types/ which is also reachable via ietf-types@iana.org. Of course, comments directly to the author are always welcome.
Copies of this and earlier versions including multiple formats can be found at http://www.shaftek.org/publications/drafts/mime-csv/.
Changes from draft-shafranovich-mime-csv-04 to draft-shafranovich-mime-csv-05:
Changes from draft-shafranovich-mime-csv-03 to draft-shafranovich-mime-csv-04:
Changes from draft-shafranovich-mime-csv-02 to draft-shafranovich-mime-csv-03:
Changes from draft-shafranovich-mime-csv-01 to draft-shafranovich-mime-csv-02:
Changes from draft-shafranovich-mime-csv-00 to draft-shafranovich-mime-csv-01:
| TOC |
The IETF takes no position regarding the validity or scope of any Intellectual Property Rights or other rights that might be claimed to pertain to the implementation or use of the technology described in this document or the extent to which any license under such rights might or might not be available; nor does it represent that it has made any independent effort to identify any such rights. Information on the procedures with respect to rights in RFC documents can be found in BCP 78 and BCP 79.
Copies of IPR disclosures made to the IETF Secretariat and any assurances of licenses to be made available, or the result of an attempt made to obtain a general license or permission for the use of such proprietary rights by implementers or users of this specification can be obtained from the IETF on-line IPR repository at http://www.ietf.org/ipr.
The IETF invites any interested party to bring to its attention any copyrights, patents or patent applications, or other proprietary rights that may cover technology that may be required to implement this standard. Please address the information to the IETF at ietf-ipr@ietf.org.
This document and the information contained herein are provided on an “AS IS” basis and THE CONTRIBUTOR, THE ORGANIZATION HE/SHE REPRESENTS OR IS SPONSORED BY (IF ANY), THE INTERNET SOCIETY AND THE INTERNET ENGINEERING TASK FORCE DISCLAIM ALL WARRANTIES, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF THE INFORMATION HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.
Copyright © The Internet Society (2005). This document is subject to the rights, licenses and restrictions contained in BCP 78, and except as set forth therein, the authors retain all their rights.
Funding for the RFC Editor function is currently provided by the Internet Society.