Hi, All!
I'm making a syslog-parsing/analyzing system for catching hackers, who try to
access my branch routers. For now it's a "single-man-project" unpublished on
github or sourceforge, but it grows every day, new data and concepts are
constantly added. A bit more and I'll compile and publish it as a standard
linux package.
Need help in optimizing (cleaning the trash from a) postgres SQL query.
- Initial data
there's a table "swedro" with the following content:
rt | gw | ad | pt | dtst
---------------------+---------+----------------+-------+-----------------
2016-09-27 15:01:13 | SO70PUB | 0.0.0.0 | 5678 | 08:10:02.118134
2016-09-27 14:56:17 | SO70PUB | 0.0.0.0 | 0 | 08:10:02.118134
2016-09-27 14:59:13 | SO70PUB | 0.0.0.0 | 5678 | 08:10:02.118134
2016-09-27 14:57:15 | SO70PUB | 0.0.0.0 | 0 | 08:10:02.118134
2016-09-27 14:59:33 | SO70PUB | 0.0.0.0 | 0 | 08:10:02.118134
2016-09-27 14:55:11 | SO70PUB | 10.100.100.121 | 1221 | 08:10:02.118134
2016-09-27 14:57:14 | SO70PUB | 10.100.100.121 | 1221 | 08:10:02.118134
...next 15M tuples omitted
These are parsed syslog messages arrived from my routers. It's all about the
addresses of the "hackers" (ad) and the ports they looked for (pt).
- Goal
to find unique addresses, who looked only for the telnet (23) port, and put
them into another table, let's say, "vulnerable_candidates". It will be used
as a target list for nmap to find a single vulnerable port, and if true to an
expect/curl/telnet-script to try the factory credentials.
- Current state
I already tryed to do this with the array_agg() function, but I think there
is a more optimal and fast way to do this.
Here's my attempt:
select test.ad, test.freq, test.numz
from
( select list.ad,count(list.pt) as freq,array_agg(list.pt) as numz
from
( select ad,pt
from swedro
group by ad,pt
)
as list
group by list.ad
) as test
where test.freq=1 AND test.numz[1]=23
order by test.ad;
- Details
A simple "select ad from swedro where pt='23' group by addr" is not suitable.
It will catch everyone who scanned our telnet port - even those who scanned
afterwards other ports like 3389,443 or 80. I need those who scanned the one
and only 23-rd port and no others!
How shuld the qeury look like without array_agg() ?
P.S.: My sysop told me that this echo is a german language area, but the rules
are written in english. So please excuse me for english if it was a wrong
decision. If it is necessary I can write the same in german or russian.
Best regards, Maxim.
--- -Natural gas is hemispheric.
* Origin: We're concerned about AIDS in our White House... (2:5020/570.35)
|