这个库是在看其他网页时,作为和功能丰富的xlsxio库的对比来的,按照xlsx_drone github页面介绍,
特征
- 不使用任何外部应用程序来解析它们。
- 注重速度而不是功能。
- 简单的接口。
- UTF-8 支持。
安装
直接将 src 和 ext 文件夹复制并粘贴到项目根文件夹和源代码中。您可能希望以不同的方式容纳文件是可以理解的,但请注意 xlsx_drone.h 使用相对路径调用其名称:#include “xlsx_drone.h”
// external libraries
#include "../ext/zip.h"
#include "../ext/sxmlc.h"
#include "../ext/sxmlsearch.h"
除此以外,连一个安装步骤都没有写,只给出了几个代码片段,没法直接使用,文档也就让你参考 src/xlsx_drone.h。
还好源代码包中有CMakelists.txt,那就表示能用cmake & make来生成。实际结果确实生成了动态库, 而且几乎瞬间生成。
/# cd par/xlsx_drone
/par/xlsx_drone# mkdir build
/par/xlsx_drone# cd build
/par/xlsx_drone/build# cmake ..
-- The C compiler identification is GNU 14.2.0
-- Detecting C compiler ABI info
-- Detecting C compiler ABI info - done
-- Check for working C compiler: /usr/bin/cc - skipped
-- Detecting C compile features
-- Detecting C compile features - done
-- Configuring done
-- Generating done
-- Build files have been written to: /par/xlsx_drone/build
/par/xlsx_drone/build# make
[ 20%] Building C object CMakeFiles/xlsx_drone.dir/ext/zip.c.o
In file included from /par/xlsx_drone/ext/zip.c:37:
/par/xlsx_drone/ext/miniz.h:5106:9: note: '#pragma message: Using fopen, ftello, fseeko, stat() etc. path for file I/O - this path may not support large files.'
5106 | #pragma message( \
| ^~~~~~~
[ 40%] Building C object CMakeFiles/xlsx_drone.dir/ext/sxmlc.c.o
[ 60%] Building C object CMakeFiles/xlsx_drone.dir/ext/sxmlsearch.c.o
[ 80%] Building C object CMakeFiles/xlsx_drone.dir/src/xlsx_drone.c.o
[100%] Linking C shared library libxlsx_drone.so
[100%] Built target xlsx_drone
浏览源码目录树,有个test目录下面有个xlsx_drone.test.c源文件,其中还有个helpers目录,装了几个看上去用来测试的文件。
上手编译它,很多链接找不到引用错误,其中一些像单元测试函数,一些像xlsx_drone中的函数。
/par/xlsx_drone/test# ls
helpers xlsx_drone.test.c
/par/xlsx_drone/test# gcc xlsx_drone.test.c -o xlsx_drone
/usr/bin/ld: /tmp/ccqiKV5x.o: in function `test_xlsx_open':
xlsx_drone.test.c:(.text+0x1c): undefined reference to `xlsx_set_print_err_messages'
/usr/bin/ld: xlsx_drone.test.c:(.text+0x2d): undefined reference to `xlsx_open'
/usr/bin/ld: xlsx_drone.test.c:(.text+0x4c): undefined reference to `UnityAssertEqualNumber'
/usr/bin/ld: xlsx_drone.test.c:(.text+0x51): undefined reference to `xlsx_get_xlsx_errno'
注意到,此文件开头包含了#include "…/ext/unity.h"语句,而ext目录下除了有unity的头文件,还有c源代码,在Makefile中没有用到,那就加上unity.c一起编译,一下子,单元测试引用的错误全消失了,只留下xlsx类错误。
gcc xlsx_drone.test.c ../ext/unity.c -o xlsx_drone
/usr/bin/ld: /tmp/ccD6n0u6.o: in function `test_xlsx_open':
xlsx_drone.test.c:(.text+0x1c): undefined reference to `xlsx_set_print_err_messages'
/usr/bin/ld: xlsx_drone.test.c:(.text+0x2d): undefined reference to `xlsx_open'
再加上-L 和-l参数,就链接成功了,把共享动态库目录加入LD_LIBRARY_PATH,就能执行了。执行报错,好像是打开文件失败,
/par/xlsx_drone/test# gcc xlsx_drone.test.c ../ext/unity.c -o xlsx_drone -L /par/xlsx_drone/build -lxlsx_drone
/par/xlsx_drone/test# ./xlsx_drone
./xlsx_drone: error while loading shared libraries: libxlsx_drone.so: cannot open shared object file: No such file or directory
/par/xlsx_drone/test# export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/par/xlsx_drone/build
/par/xlsx_drone/test# ./xlsx_drone
xlsx_drone.test.c:19:test_xlsx_open:FAIL: Expected 1 Was 0
xlsx_drone.test.c:370:test_xlsx_load_sheet:FAIL: Expected 0 Was -13
Segmentation fault (core dumped)
看xlsx_drone.test.c源码,里面的路径是"test\helpers\non_existent.xlsx"之类,于是再在父目录执行,结果还是报错。最后把Windows的\目录分隔符全都替换成Linux的/,重新编译,执行成功,如下所示;
/par/xlsx_drone# gcc test/xlsx_drone.test2.c ext/unity.c -o xlsx_drone2 -L /par/xlsx_drone/build -lxlsx_drone
/par/xlsx_drone# ./xlsx_drone2
test/xlsx_drone.test2.c:16:test_xlsx_open:FAIL: Expected 0 Was 1
test/xlsx_drone.test2.c:1231:test_xlsx_load_sheet:PASS
test/xlsx_drone.test2.c:1232:test_xlsx_unload_sheet:PASS
test/xlsx_drone.test2.c:1233:test_xlsx_get_last_column:PASS
test/xlsx_drone.test2.c:476:test_xlsx_read_cell:FAIL: Unity Double Precision Disabled
test/xlsx_drone.test2.c:1235:test_xlsx_close:PASS
-----------------------
6 Tests 2 Failures 0 Ignored
FAIL
其中第一个测试是测不存在的文件,我故意放了一个non_existent.xlsx在helpers,导致期望0的测试执行结果为1,也报错了。
再把首页的代码片段放入一个main函数,并包含头文件,
#include "src/xlsx_drone.h"
int main()
{
// open *.xlsx
xlsx_workbook_t wb;
xlsx_open("test/helpers/sample.xlsx", &wb);
// be free to inspect some wb data
int number_of_sheets = wb.n_sheets;
// load sheet
xlsx_sheet_t *sheet_1 = xlsx_load_sheet(&wb, 1, NULL);
// be free to inspect some sheet data
char *sheet_name = sheet_1->name;
int last_row = sheet_1->last_row; // valued 0 if the sheet is empty
// as of version 0.2.0 you can retrieve the last column (more info below)
char *last_column = xlsx_get_last_column(sheet_1); // i.e.: "FB" or "R", etc.
// read cell
xlsx_cell_t cell_data_holder;
xlsx_read_cell(sheet_1, 4, "B", &cell_data_holder);
// inspect result
switch(cell_data_holder.value_type) {
case XLSX_POINTER_TO_CHAR:
printf("Cell 4B has value: %s", cell_data_holder.value.pointer_to_char_value);
break;
case XLSX_INT:
printf("Cell 4B has value: %d", cell_data_holder.value.int_value);
break;
case XLSX_LONG_LONG:
printf("Cell 4B has value: %lld", cell_data_holder.value.long_long_value);
break;
case XLSX_DOUBLE:
printf("Cell 4B has value: %f", cell_data_holder.value.double_value);
break;
default:
printf("Cell 4B has no value");
}
// you can also inspect the cell category
int cell_category = cell_data_holder.style->related_category;
/*
typedef enum xlsx_cell_category {
XLSX_NUMBER, // int, long long, or double
XLSX_TEXT, // string
XLSX_DATE, // int
XLSX_TIME, // double
XLSX_DATE_TIME, // double
XLSX_UNKNOWN
} xlsx_cell_category;
*/
char cat[7][10]={"NUMBER", "TEXT", "DATE", "TIME", "DATE_TIME", "UNKNOWN"};
printf("Cell 4B has category XLSX_%s\n", cat[cell_category]);
// when you're done reading the XLSX, close the workbook to properly free resources
xlsx_close(&wb);
}
编译运行结果如下
gcc example.c -o example -L /par/xlsx_drone/build -lxlsx_drone
/par/xlsx_drone# ./example
Cell 4B has value: -1000Cell 4B has category XLSX_NUMBER
再让DeepSeek照样编写一个把xlsx转为csv的程序。
#include "src/xlsx_drone.h"
#include <stdio.h> // 添加标准IO头文件用于CSV操作
int main()
{
// 打开 *.xlsx 文件
xlsx_workbook_t wb;
xlsx_open("test/helpers/sample.xlsx", &wb);
// 加载第一个工作表
xlsx_sheet_t *sheet = xlsx_load_sheet(&wb, 1, NULL);
// 创建并打开CSV文件用于写入
FILE *csv_file = fopen("output.csv", "w");
if (csv_file == NULL) {
printf("无法创建CSV文件\n");
xlsx_close(&wb);
return 1;
}
// 获取工作表尺寸信息
int last_row = sheet->last_row;
char *last_column = xlsx_get_last_column(sheet);
// 遍历所有行和列
for (int row = 1; row <= last_row; row++) {
for (char col_char = 'A'; col_char <= last_column[0]; col_char++) {
// 处理多字母列名(如AA, AB等)
char col_str[3] = {col_char, '\0'};
if (last_column[1] != '\0') {
// 如果需要处理两字母列名,这里需要扩展逻辑
}
// 读取单元格数据
xlsx_cell_t cell;
xlsx_read_cell(sheet, row, col_str, &cell);
// 根据数据类型写入CSV
switch(cell.value_type) {
case XLSX_POINTER_TO_CHAR:
fprintf(csv_file, "\"%s\"", cell.value.pointer_to_char_value);
break;
case XLSX_INT:
fprintf(csv_file, "%d", cell.value.int_value);
break;
case XLSX_LONG_LONG:
fprintf(csv_file, "%lld", cell.value.long_long_value);
break;
case XLSX_DOUBLE:
fprintf(csv_file, "%f", cell.value.double_value);
break;
default:
// 空单元格留空
break;
}
// 添加逗号分隔符(最后一列后不加逗号)
if (col_char < last_column[0]) {
fprintf(csv_file, ",");
}
}
// 换行表示新的一行
fprintf(csv_file, "\n");
}
// 关闭CSV文件
fclose(csv_file);
// 关闭工作簿并释放资源
xlsx_close(&wb);
printf("XLSX文件已成功转换为CSV格式\n");
return 0;
}
编译执行成功,前6行的内容如下,可见它支持UTF-8, 包括汉字希腊文。
gcc xlsx2csv.c -o xlsx2csv -L /par/xlsx_drone/build -lxlsx_drone
/par/xlsx_drone# ./xls2csv
bash: ./xls2csv: No such file or directory
/par/xlsx_drone# ./xlsx2csv
XLSX文件已成功转换为CSV格式
head -6 output.csv
"General","Number","Currency","Accounting","Date","Time","Percentage","Fraction (1.5)","Scientific (0.001)","Text","Special","Custom"
"Foo",1000,1000,147,43458,0.104792,0.500000,1.500000,0.001000,"1875",2000,12
235,1000,-14562.740000,1200.874000,43458,0.104792,0.450000,1.500000,0.001000,"Just text",2000,40955
17.890000,-1000,584,,43458,0.104792,1.600000,1.500000,0.001000,"𐐀34",543415635644,
,-1000,,,43458,0.104792,,1.500000,,"foo你bar好qaz",34580585,
,1200.561000,,,43458,0.104792,,1.500000,,,,
再用这个代码读取8*1M行的xlsx文件,看它的性能。
gcc xlsx2csv.c -o xlsx2csv -L /par/xlsx_drone/build -lxlsx_drone -O3
root@6ae32a5ffcde:/par/xlsx_drone# time ./xlsx2csv
XLSX文件已成功转换为CSV格式
real 0m14.381s
user 0m10.321s
sys 0m3.695s
尽管用了-O3编译,还是不太快。