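TDengine Database consists of a server and a client. The server currently supports only Linux; the client is delivered as a library and can be used on either Linux or Windows.
Because of bugs, the TDengine Database server or client may crash. When that happens, a core file should be generated so that the cause can be analyzed efficiently and the bug can be located and fixed quickly.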
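Generating a core file on Linux
On Linux, two conditions must be met for a core file to be generated:
1. The core file size limit is not 0.
The default is 0, which disables core generation; it must be changed to a non-zero value.
2. The directory where the core file is saved is writable.
By default this is the current working directory of the program, or the root directory "/" for programs started by systemctl. The generated file is named core, so repeated crashes do not produce separate files and only one core file is kept.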
TDengine Database can meet these conditions in either of two ways.
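I. Using shell commands
1. Remove the limit on core file size
ulimit -c unlimited
2. Set the directory where core files are generated
sudo sysctl -w kernel.core_pattern='/<corefile_dir>/core-%e-%p'
Here corefile_dir is the path where core files are saved (it must be created in advance). %e and %p add the program name and the pid to the file name, so that core files from multiple crashes are all kept.
Drawback: this changes a system-wide setting, so core files from any other program that crashes are also saved to the specified path.
To add only the pid to the core file name without changing kernel.core_pattern, set kernel.core_uses_pid to 1.
sudo sysctl -w kernel.core_uses_pid=1
Note: the user that starts the program must have write permission on the <corefile_dir> directory.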
II. Using the code API
The same parameters can be set with functions provided by the system.
Sample code:
// 1. set ulimit -c unlimited
struct rlimit rlim;
struct rlimit rlim_new;
if (getrlimit(RLIMIT_CORE, &rlim) == 0) {
  // rlim_t is unsigned and may be wider than int, so cast before printing
  pPrint("the old unlimited para: rlim_cur=%llu, rlim_max=%llu", (unsigned long long)rlim.rlim_cur, (unsigned long long)rlim.rlim_max);
  rlim_new.rlim_cur = RLIM_INFINITY;
  rlim_new.rlim_max = RLIM_INFINITY;
  if (setrlimit(RLIMIT_CORE, &rlim_new) != 0) {
    pPrint("set unlimited fail, error: %s", strerror(errno));
    // raising the hard limit failed; fall back to the existing hard limit
    rlim_new.rlim_cur = rlim.rlim_max;
    rlim_new.rlim_max = rlim.rlim_max;
    (void)setrlimit(RLIMIT_CORE, &rlim_new);
  }
}
if (getrlimit(RLIMIT_CORE, &rlim) == 0) {
  pPrint("the new unlimited para: rlim_cur=%llu, rlim_max=%llu", (unsigned long long)rlim.rlim_cur, (unsigned long long)rlim.rlim_max);
}
// 2. set pid into core file name
// note: the _sysctl system call is deprecated and was removed in Linux 5.5;
// on such kernels use /proc/sys/kernel/core_uses_pid instead
struct __sysctl_args args;
int old_usespid = 0;
size_t old_len = 0;
int new_usespid = 1;
size_t new_len = sizeof(new_usespid);
int name[] = {CTL_KERN, KERN_CORE_USES_PID};
// set the new value and fetch the old one in the same call
memset(&args, 0, sizeof(struct __sysctl_args));
args.name = name;
args.nlen = sizeof(name)/sizeof(name[0]);
args.oldval = &old_usespid;
args.oldlenp = &old_len;
args.newval = &new_usespid;
args.newlen = new_len;
old_len = sizeof(old_usespid);
if (syscall(SYS__sysctl, &args) == -1) {
  pPrint("_sysctl(kern_core_uses_pid) set fail: %s", strerror(errno));
}
pPrint("The old core_uses_pid[%zu]: %d", old_len, old_usespid);
// read the value back to confirm the change
old_usespid = 0;
old_len = 0;
memset(&args, 0, sizeof(struct __sysctl_args));
args.name = name;
args.nlen = sizeof(name)/sizeof(name[0]);
args.oldval = &old_usespid;
args.oldlenp = &old_len;
old_len = sizeof(old_usespid);
if (syscall(SYS__sysctl, &args) == -1) {
  pPrint("_sysctl(kern_core_uses_pid) get fail: %s", strerror(errno));
}
pPrint("The new core_uses_pid[%zu]: %d", old_len, old_usespid);
Example function for setting kernel.core_pattern:
// 3. create the path for saving core file
int status;
char coredump_dir[32] = "/var/log/taosdump";
DIR *dir = opendir(coredump_dir);
if (dir == NULL) {
  status = mkdir(coredump_dir, S_IRWXU | S_IRWXG | S_IRWXO);
  if (status) {
    pPrint("mkdir fail, error: %s\n", strerror(errno));
  }
} else {
  // the directory already exists; release the handle
  closedir(dir);
}
// 4. set kernel.core_pattern
struct __sysctl_args args;
char old_corefile[128];
size_t old_len;
char new_corefile[128] = "/var/log/taosdump/core-%e-%p";
size_t new_len = sizeof(new_corefile);
int name[] = {CTL_KERN, KERN_CORE_PATTERN};
// set the new pattern and fetch the old one in the same call
memset(&args, 0, sizeof(struct __sysctl_args));
args.name = name;
args.nlen = sizeof(name)/sizeof(name[0]);
args.oldval = old_corefile;
args.oldlenp = &old_len;
args.newval = new_corefile;
args.newlen = new_len;
old_len = sizeof(old_corefile);
if (syscall(SYS__sysctl, &args) == -1) {
  pPrint("_sysctl(kern_core_pattern) set fail: %s", strerror(errno));
}
// %.*s limits the output to old_len bytes; the precision argument must be an int
pPrint("The old kern_core_pattern: %.*s\n", (int)old_len, old_corefile);
// read the pattern back to confirm the change
memset(&args, 0, sizeof(struct __sysctl_args));
args.name = name;
args.nlen = sizeof(name)/sizeof(name[0]);
args.oldval = old_corefile;
args.oldlenp = &old_len;
old_len = sizeof(old_corefile);
if (syscall(SYS__sysctl, &args) == -1) {
  pPrint("_sysctl(kern_core_pattern) get fail: %s", strerror(errno));
}
pPrint("The new kern_core_pattern: %.*s\n", (int)old_len, old_corefile);
When a crash occurs on Linux, collect the core file and the corresponding executable, then analyze them with gdb, for example:
plum@plum-VirtualBox:~/git/TDinternal/debug/build/bin$ sudo gdb ./taos core-taos-22675
GNU gdb (Ubuntu 7.11.1-0ubuntu1~16.5) 7.11.1
Copyright (C) 2016 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law. Type "show copying"
and "show warranty" for details.
This GDB was configured as "x86_64-linux-gnu".
Type "show configuration" for configuration details.
For bug reporting instructions, please see:
<http://www.gnu.org/software/gdb/bugs/>.
Find the GDB manual and other documentation resources online at:
<http://www.gnu.org/software/gdb/documentation/>.
For help, type "help".
Type "apropos word" to search for commands related to "word"...
Reading symbols from ./taos...done.
Core was generated by `/home/plum/git/TDengine/debug/build/bin/taos -c /etc/taos'.
(gdb) bt
#0 0x00000000004261d4 in tsParseOneRowData (str=0x980, pDataBlocks=0x7fffd8033ca0, schema=0x7fffd8034884, spd=0x7fffedcd2660, error=0x79fbf0 "", timePrec=0)
at /home/plum/git/TDinternal/community/src/client/src/tscParseInsert.c:411
#1 0x00000000004266f8 in tsParseValues (str=0x7fffedcd23b0, pDataBlock=0x7fffd8033ca0, pMeterMeta=0x7fffd803483c, maxRows=14, spd=0x7fffedcd2660, error=0x79fbf0 "")
at /home/plum/git/TDinternal/community/src/client/src/tscParseInsert.c:508
#2 0x0000000000426d8b in doParseInsertStatement (pSql=0x797dd0, pTableHashList=0x7fffd80342e0, str=0x7fffedcd23b0, spd=0x7fffedcd2660, totalNum=0x7fffedcd23cc)
at /home/plum/git/TDinternal/community/src/client/src/tscParseInsert.c:640
#3 0x00000000004281da in doParserInsertSql (pSql=0x797dd0, str=0x7fffd8034c64 "0, '123', '11111\\'9911111');") at /home/plum/git/TDinternal/community/src/client/src/tscParseInsert.c:1036
#4 0x00000000004284b9 in tsParseInsertSql (pSql=0x797dd0, sql=0x7fffd8034c30 "insert into test.demo (ts, task_id, message) values(0, '123', '11111\\'9911111');", acct=0x7074a2 "0", db=0x7074c2 "0.test")
at /home/plum/git/TDinternal/community/src/client/src/tscParseInsert.c:1103
#5 0x00000000004285f2 in tsParseSql (pSql=0x797dd0, acct=0x7074a2 "0", db=0x7074c2 "0.test", multiVnodeInsertion=false) at /home/plum/git/TDinternal/community/src/client/src/tscParseInsert.c:1127
#6 0x0000000000455572 in taos_query_imp (pObj=0x707450, pSql=0x797dd0) at /home/plum/git/TDinternal/community/src/client/src/tscSql.c:235
#7 0x0000000000455907 in taos_query (taos=0x707450, sqlstr=0x7fffd80008c0 "insert into test.demo (ts, task_id, message) values(0, '123', '11111\\'9911111');")
at /home/plum/git/TDinternal/community/src/client/src/tscSql.c:293
#8 0x0000000000414b01 in shellRunCommandOnServer (con=0x707450, command=0x7fffd80008c0 "insert into test.demo (ts, task_id, message) values(0, '123', '11111\\'9911111');")
at /home/plum/git/TDinternal/community/src/kit/shell/src/shellEngine.c:253
#9 0x00000000004149ad in shellRunCommand (con=0x707450, command=0x7fffd80008c0 "insert into test.demo (ts, task_id, message) values(0, '123', '11111\\'9911111');")
at /home/plum/git/TDinternal/community/src/kit/shell/src/shellEngine.c:215
#10 0x0000000000417b75 in shellLoopQuery (arg=0x707450) at /home/plum/git/TDinternal/community/src/kit/shell/src/shellLinux.c:291
#11 0x00007ffff7bc16ba in start_thread (arg=0x7fffedcd4700) at pthread_create.c:333
#12 0x00007ffff73e641d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:109
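Generating a core file on Windows
I. Configuring dump (dmp) file generation
Change the registry so that the operating system automatically generates a dump when a program crashes and places it in a specific directory.
Add the registry key HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Windows\Windows Error Reporting\LocalDumps
Add the values under it as shown in the figure below: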

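DumpType has the following meaning: 0 = Create a custom dump, 1 = Mini dump, 2 = Full dump
After a program crashes, a dump file is generated in the c:\TDengine directory.
II. Debugging the dmp file with VS
Open the dmp file in VS. In this test the dmp file was produced locally, so VS uses it to find the paths of the exe, the pdb and the source code by itself. Clicking Debug directly makes the program break at the line that caused the error.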

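If the dump was produced by the exe on another machine, however, it is best to put the exe, pdb and dmp in the same folder. The pdb must have been generated together with the exe that crashed. After opening the dump file in VS, the symbol file path and the source code path also need to be set:
(1) If the pdb file is in the same directory as the dmp file, its path does not need to be set; otherwise set it under
Tools -> Options -> Debugging -> Symbols: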

(2) The source code path also needs to be set:
Properties -> Debug Source Files:

Then click "Debug with Native Only" to break at the line that caused the error:



