轻量xlsx读取库xlsx_drone的编译与测试

发布于:2025-08-31 ⋅ 阅读:(17) ⋅ 点赞:(0)

这个库是在看其他网页时,作为和功能丰富的xlsxio库的对比来的,按照xlsx_drone github页面介绍,
特征

  • 不使用任何外部应用程序来解析它们。
  • 注重速度而不是功能。
  • 简单的接口。
  • UTF-8 支持。

安装
直接将 src 和 ext 文件夹复制并粘贴到项目根文件夹和源代码中。您可能希望以不同的方式容纳文件是可以理解的,但请注意 xlsx_drone.h 使用相对路径调用其名称:#include “xlsx_drone.h”

// external libraries
#include "../ext/zip.h"
#include "../ext/sxmlc.h"
#include "../ext/sxmlsearch.h"

除此以外,连一个安装步骤都没有写,只给出了几个代码片段,没法直接使用,文档也就让你参考 src/xlsx_drone.h。

还好源代码包中有CMakelists.txt,那就表示能用cmake & make来生成。实际结果确实生成了动态库, 而且几乎瞬间生成。

/# cd par/xlsx_drone
/par/xlsx_drone# mkdir build
/par/xlsx_drone# cd build
/par/xlsx_drone/build# cmake ..
-- The C compiler identification is GNU 14.2.0
-- Detecting C compiler ABI info
-- Detecting C compiler ABI info - done
-- Check for working C compiler: /usr/bin/cc - skipped
-- Detecting C compile features
-- Detecting C compile features - done
-- Configuring done
-- Generating done
-- Build files have been written to: /par/xlsx_drone/build
/par/xlsx_drone/build# make
[ 20%] Building C object CMakeFiles/xlsx_drone.dir/ext/zip.c.o
In file included from /par/xlsx_drone/ext/zip.c:37:
/par/xlsx_drone/ext/miniz.h:5106:9: note: '#pragma message: Using fopen, ftello, fseeko, stat() etc. path for file I/O - this path may not support large files.'
 5106 | #pragma message(                                                               \
      |         ^~~~~~~
[ 40%] Building C object CMakeFiles/xlsx_drone.dir/ext/sxmlc.c.o
[ 60%] Building C object CMakeFiles/xlsx_drone.dir/ext/sxmlsearch.c.o
[ 80%] Building C object CMakeFiles/xlsx_drone.dir/src/xlsx_drone.c.o
[100%] Linking C shared library libxlsx_drone.so
[100%] Built target xlsx_drone

浏览源码目录树,有个test目录下面有个xlsx_drone.test.c源文件,其中还有个helpers目录,装了几个看上去用来测试的文件。
上手编译它,很多链接找不到引用错误,其中一些像单元测试函数,一些像xlsx_drone中的函数。

/par/xlsx_drone/test# ls
helpers  xlsx_drone.test.c
/par/xlsx_drone/test# gcc  xlsx_drone.test.c -o xlsx_drone
/usr/bin/ld: /tmp/ccqiKV5x.o: in function `test_xlsx_open':
xlsx_drone.test.c:(.text+0x1c): undefined reference to `xlsx_set_print_err_messages'
/usr/bin/ld: xlsx_drone.test.c:(.text+0x2d): undefined reference to `xlsx_open'
/usr/bin/ld: xlsx_drone.test.c:(.text+0x4c): undefined reference to `UnityAssertEqualNumber'
/usr/bin/ld: xlsx_drone.test.c:(.text+0x51): undefined reference to `xlsx_get_xlsx_errno'

注意到,此文件开头包含了#include "…/ext/unity.h"语句,而ext目录下除了有unity的头文件,还有c源代码,在Makefile中没有用到,那就加上unity.c一起编译,一下子,单元测试引用的错误全消失了,只留下xlsx类错误。

gcc  xlsx_drone.test.c ../ext/unity.c -o xlsx_drone

/usr/bin/ld: /tmp/ccD6n0u6.o: in function `test_xlsx_open':
xlsx_drone.test.c:(.text+0x1c): undefined reference to `xlsx_set_print_err_messages'
/usr/bin/ld: xlsx_drone.test.c:(.text+0x2d): undefined reference to `xlsx_open'

再加上-L 和-l参数,就链接成功了,把共享动态库目录加入LD_LIBRARY_PATH,就能执行了。执行报错,好像是打开文件失败,

/par/xlsx_drone/test# gcc  xlsx_drone.test.c ../ext/unity.c -o xlsx_drone -L /par/xlsx_drone/build -lxlsx_drone
/par/xlsx_drone/test# ./xlsx_drone
./xlsx_drone: error while loading shared libraries: libxlsx_drone.so: cannot open shared object file: No such file or directory

/par/xlsx_drone/test# export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/par/xlsx_drone/build
/par/xlsx_drone/test# ./xlsx_drone
xlsx_drone.test.c:19:test_xlsx_open:FAIL: Expected 1 Was 0
xlsx_drone.test.c:370:test_xlsx_load_sheet:FAIL: Expected 0 Was -13
Segmentation fault (core dumped)

看xlsx_drone.test.c源码,里面的路径是"test\helpers\non_existent.xlsx"之类,于是再在父目录执行,结果还是报错。最后把Windows的\目录分隔符全都替换成Linux的/,重新编译,执行成功,如下所示;

/par/xlsx_drone# gcc  test/xlsx_drone.test2.c ext/unity.c -o xlsx_drone2 -L /par/xlsx_drone/build -lxlsx_drone
/par/xlsx_drone# ./xlsx_drone2
test/xlsx_drone.test2.c:16:test_xlsx_open:FAIL: Expected 0 Was 1
test/xlsx_drone.test2.c:1231:test_xlsx_load_sheet:PASS
test/xlsx_drone.test2.c:1232:test_xlsx_unload_sheet:PASS
test/xlsx_drone.test2.c:1233:test_xlsx_get_last_column:PASS
test/xlsx_drone.test2.c:476:test_xlsx_read_cell:FAIL: Unity Double Precision Disabled
test/xlsx_drone.test2.c:1235:test_xlsx_close:PASS

-----------------------
6 Tests 2 Failures 0 Ignored
FAIL

其中第一个测试是测不存在的文件,我故意放了一个non_existent.xlsx在helpers,导致期望0的测试执行结果为1,也报错了。

再把首页的代码片段放入一个main函数,并包含头文件,

#include "src/xlsx_drone.h"
int main()
{
	// open *.xlsx
	xlsx_workbook_t wb;
	xlsx_open("test/helpers/sample.xlsx", &wb);
	// be free to inspect some wb data
	int number_of_sheets = wb.n_sheets;
	// load sheet
	xlsx_sheet_t *sheet_1 = xlsx_load_sheet(&wb, 1, NULL);
	// be free to inspect some sheet data
	char *sheet_name = sheet_1->name;
	int last_row = sheet_1->last_row; // valued 0 if the sheet is empty
	// as of version 0.2.0 you can retrieve the last column (more info below)
	char *last_column = xlsx_get_last_column(sheet_1); // i.e.: "FB" or "R", etc.
	// read cell
	xlsx_cell_t cell_data_holder;    
	xlsx_read_cell(sheet_1, 4, "B", &cell_data_holder);
	// inspect result
	switch(cell_data_holder.value_type) {
	  case XLSX_POINTER_TO_CHAR:
		printf("Cell 4B has value: %s", cell_data_holder.value.pointer_to_char_value);
		break;
	  case XLSX_INT:
		printf("Cell 4B has value: %d", cell_data_holder.value.int_value);
		break;
	  case XLSX_LONG_LONG:
		printf("Cell 4B has value: %lld", cell_data_holder.value.long_long_value);
		break;
	  case XLSX_DOUBLE:
		printf("Cell 4B has value: %f", cell_data_holder.value.double_value);
		break;
	  default:
		printf("Cell 4B has no value");
	}
	// you can also inspect the cell category
	int cell_category = cell_data_holder.style->related_category;	
	/*
	typedef enum xlsx_cell_category {
	  XLSX_NUMBER, // int, long long, or double
	  XLSX_TEXT, // string
	  XLSX_DATE, // int
	  XLSX_TIME, // double
	  XLSX_DATE_TIME, // double
	  XLSX_UNKNOWN
	} xlsx_cell_category;
	*/
	char cat[7][10]={"NUMBER", "TEXT", "DATE", "TIME", "DATE_TIME", "UNKNOWN"};
	printf("Cell 4B has category XLSX_%s\n", cat[cell_category]);

	// when you're done reading the XLSX, close the workbook to properly free resources
	xlsx_close(&wb);	
}

编译运行结果如下

gcc example.c -o example -L /par/xlsx_drone/build -lxlsx_drone
/par/xlsx_drone# ./example
Cell 4B has value: -1000Cell 4B has category XLSX_NUMBER

再让DeepSeek照样编写一个把xlsx转为csv的程序。

#include "src/xlsx_drone.h"
#include <stdio.h> // 添加标准IO头文件用于CSV操作

int main()
{
    // 打开 *.xlsx 文件
    xlsx_workbook_t wb;
    xlsx_open("test/helpers/sample.xlsx", &wb);
    
    // 加载第一个工作表
    xlsx_sheet_t *sheet = xlsx_load_sheet(&wb, 1, NULL);
    
    // 创建并打开CSV文件用于写入
    FILE *csv_file = fopen("output.csv", "w");
    if (csv_file == NULL) {
        printf("无法创建CSV文件\n");
        xlsx_close(&wb);
        return 1;
    }
    
    // 获取工作表尺寸信息
    int last_row = sheet->last_row;
    char *last_column = xlsx_get_last_column(sheet);
    
    // 遍历所有行和列
    for (int row = 1; row <= last_row; row++) {
        for (char col_char = 'A'; col_char <= last_column[0]; col_char++) {
            // 处理多字母列名(如AA, AB等)
            char col_str[3] = {col_char, '\0'};
            if (last_column[1] != '\0') {
                // 如果需要处理两字母列名,这里需要扩展逻辑
            }
            
            // 读取单元格数据
            xlsx_cell_t cell;
            xlsx_read_cell(sheet, row, col_str, &cell);
            
            // 根据数据类型写入CSV
            switch(cell.value_type) {
                case XLSX_POINTER_TO_CHAR:
                    fprintf(csv_file, "\"%s\"", cell.value.pointer_to_char_value);
                    break;
                case XLSX_INT:
                    fprintf(csv_file, "%d", cell.value.int_value);
                    break;
                case XLSX_LONG_LONG:
                    fprintf(csv_file, "%lld", cell.value.long_long_value);
                    break;
                case XLSX_DOUBLE:
                    fprintf(csv_file, "%f", cell.value.double_value);
                    break;
                default:
                    // 空单元格留空
                    break;
            }
            
            // 添加逗号分隔符(最后一列后不加逗号)
            if (col_char < last_column[0]) {
                fprintf(csv_file, ",");
            }
        }
        // 换行表示新的一行
        fprintf(csv_file, "\n");
    }
    
    // 关闭CSV文件
    fclose(csv_file);
    
    // 关闭工作簿并释放资源
    xlsx_close(&wb);
    
    printf("XLSX文件已成功转换为CSV格式\n");
    return 0;
}

编译执行成功,前6行的内容如下,可见它支持UTF-8, 包括汉字希腊文。

gcc xlsx2csv.c -o xlsx2csv -L /par/xlsx_drone/build -lxlsx_drone
/par/xlsx_drone# ./xls2csv
bash: ./xls2csv: No such file or directory
/par/xlsx_drone# ./xlsx2csv
XLSX文件已成功转换为CSV格式

head -6 output.csv
"General","Number","Currency","Accounting","Date","Time","Percentage","Fraction (1.5)","Scientific (0.001)","Text","Special","Custom"
"Foo",1000,1000,147,43458,0.104792,0.500000,1.500000,0.001000,"1875",2000,12
235,1000,-14562.740000,1200.874000,43458,0.104792,0.450000,1.500000,0.001000,"Just text",2000,40955
17.890000,-1000,584,,43458,0.104792,1.600000,1.500000,0.001000,"𐐀34",543415635644,
,-1000,,,43458,0.104792,,1.500000,,"foo你bar好qaz",34580585,
,1200.561000,,,43458,0.104792,,1.500000,,,,

再用这个代码读取8*1M行的xlsx文件,看它的性能。

gcc xlsx2csv.c -o xlsx2csv -L /par/xlsx_drone/build -lxlsx_drone -O3
root@6ae32a5ffcde:/par/xlsx_drone# time ./xlsx2csv
XLSX文件已成功转换为CSV格式

real    0m14.381s
user    0m10.321s
sys     0m3.695s

尽管用了-O3编译,还是不太快。


网站公告

今日签到

点亮在社区的每一天
去签到