Statistics Based Checkers in the Clang Static Analyzer
Ádám Balogh, adam.balogh@ericsson.com
Euro LLVM 2019, Brussels, Belgium
Ericsson, 2019-04-08
Ericsson Internal | 2018-02-21

The Problem
- Huge legacy code with weak documentation

  X.h:
      int may_return_negative(); // no body available

- Many calls to may_return_negative(); the return value is checked for negative in e.g. 98% of the calls

  Y1.c, Y2.c, Y3.c, Y4.c, ...:
      int i = may_return_negative();
      if (i < 0)
        return;
      v[i]; // OK.

  Yn.c:
      int i = may_return_negative();
      v[i]; // error: negative indexing

- Goal: detect the 2% of the calls where the return value is not checked for negative

What do we check?

Ignored Return Value
- Find calls where the return value is ignored but should not be

  Ignored.c:
      fread (data, …); // data may be garbage if read failed

Special Return Value
- Negative Integers: find calls where the integer return value is not checked for negative but should be

  NegRet.c:
      int i = ret_neg();
      v[i]; // error

- Null Pointers: find calls where the pointer return value is not checked for null pointer but should be

  NullRet.c:
      T *t = ret_null();
      t->field; // error

Other "special" values may be added
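
For comparison, the three patterns above can be written with the check in place. This is only an illustrative sketch: the helper names (read_all, element_or, field_or) and the struct T layout are hypothetical and do not come from the slides.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical type standing in for T on the slide. */
struct T { int field; };

/* Ignored return value: compare the item count returned by fread with
 * the requested count instead of discarding it.
 * Returns 1 if all n bytes were read, 0 otherwise. */
int read_all(FILE *f, char *data, size_t n)
{
    return fread(data, 1, n, f) == n;
}

/* Negative integer: test the index for negative before using it.
 * The caller must still guarantee that i is below the array length. */
int element_or(const int *v, int i, int fallback)
{
    return (i < 0) ? fallback : v[i];
}

/* Null pointer: test the pointer before dereferencing it. */
int field_or(const struct T *t, int fallback)
{
    return t ? t->field : fallback;
}
```

Call sites written this way count as "checked" in the statistics; call sites that use the value directly are the ones the checkers are meant to find.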

How does it work?

[Diagram: two passes of the Clang SA over the same source files]
  Phase 1: Collect statistics: source files (*.c, *.cpp) -> Clang SA -> *.yaml
  Phase 2: Analyze: source files (*.c, *.cpp) + *.yaml -> Clang SA

- Threshold and minimum required number of calls configurable (default: 85% and 10 calls)
- CodeChecker support exists, open sourcing planned
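
The decision made from the collected statistics can be sketched as follows. This is a minimal sketch under stated assumptions: the struct and its field names are hypothetical (the slides do not show the YAML schema), and treating both limits as inclusive is an assumption. Only the default values, 85% and 10 calls, come from the slide.

```c
#include <stdbool.h>

/* Hypothetical per-function record; the real YAML layout is not shown. */
struct call_stats {
    unsigned total_calls;   /* calls to the function seen in phase 1 */
    unsigned checked_calls; /* calls whose return value was checked  */
};

#define MIN_CALLS         10u /* default minimum number of calls */
#define THRESHOLD_PERCENT 85u /* default threshold               */

/* Returns true if an unchecked call to this function should be treated
 * as suspicious in phase 2. Inclusive comparisons are an assumption. */
bool must_check_return(const struct call_stats *s)
{
    if (s->total_calls < MIN_CALLS)
        return false; /* too few samples to draw a conclusion */

    /* Integer form of checked / total >= THRESHOLD_PERCENT / 100. */
    return (unsigned long long)s->checked_calls * 100u >=
           (unsigned long long)s->total_calls * THRESHOLD_PERCENT;
}
```

With these defaults, a function checked in 98 of 100 calls qualifies, while one checked in 9 of 9 calls does not, because it is below the minimum number of calls.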

Checking for Special Return Values
- No warnings, just state split

      int i = may_return_negative();

      i: [INT_MIN..-1]        i: [0..INT_MAX]
      x = v[i]; // Error!     x = v[i]; // OK.
      ^ Reported by the negative array indexing checker!

- Performance impact: low, because the special return value branch terminates quickly
  - Either by early exit or because of an error
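
The state split can be pictured with a toy model that tracks the possible values of i as a closed interval. This is not the Clang Static Analyzer API; the types and the function split_on_negative are hypothetical and only illustrate how one state becomes two.

```c
#include <limits.h>
#include <stdbool.h>

/* Toy model: the possible values of an int as a closed interval. */
struct int_range { int lo, hi; };

/* Result of splitting one state on "value < 0". */
struct split_states {
    bool has_special;         /* negative branch is feasible     */
    struct int_range special; /* values in [lo .. -1]            */
    bool has_normal;          /* non-negative branch is feasible */
    struct int_range normal;  /* values in [0 .. hi]             */
};

/* Split a range into its negative and non-negative parts. No warning is
 * produced here; a later use such as v[i] on the negative branch is what
 * another checker would report. */
struct split_states split_on_negative(struct int_range r)
{
    struct split_states out = {0};

    if (r.lo < 0) {
        out.has_special = true;
        out.special.lo = r.lo;
        out.special.hi = (r.hi < -1) ? r.hi : -1;
    }
    if (r.hi >= 0) {
        out.has_normal = true;
        out.normal.lo = (r.lo > 0) ? r.lo : 0;
        out.normal.hi = r.hi;
    }
    return out;
}
```

Splitting the full range [INT_MIN..INT_MAX] yields the two intervals shown on the slide. A range that is already known to be non-negative, for example after an `if (i < 0) return;`, has no special branch at all.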

Future Work
- False positives: the possible return values often depend on the arguments
- Solution: take the parameters into consideration as well

Thank You! adam.balogh@ericsson.com