Styx

IMPORTANT

For development purposes, the database and the schema are dropped each time you restart Styx. This behavior is currently hardcoded to make development easier.

Prerequisites

Styx uses a couple of other services to run:

  • Kafka for messaging (not yet included in the Docker Compose setup, but not currently required)
  • Dgraph for graph representation of results
  • Docker-compose to launch everything

For that purpose, there is a docker-compose.yml file that you can spin up with the following command from the repository directory:

docker-compose up -d

Note: for some reason, OpenVPN prevents Docker Compose from coming up. Alternatively, you can run Dgraph manually as follows:

docker run --rm -it -p 8080:8080 -p 9080:9080 -p 8000:8000 -v ~/dgraph:/dgraph dgraph/standalone:v20.03.0

Install

go get -u gitlab.dcso.lolcat/LABS/styx
cd $GOPATH/src/gitlab.dcso.lolcat/LABS/styx
go build
docker-compose up -d # or the other command
./styx

Note: if you have issues with Docker Compose, make sure the containers run on the same subnet. Check this for inspiration.

Example configuration:

certstream:
  activated: true

pastebin:
  activated: true

shodan:
  activated: true
  key: "SHODAN_KEY"
  ports:
    - 80
    - 443

kafka:
  activated: true
  protocol: "tcp"
  host: "localhost"
  port: 9092
  topic: "styx"
  partition: 0

balboa:
  # the url you tunneled to Balboa
  url: http://127.0.0.1:8030
  activated: true

elasticsearch:
  activated: true
  url: http://localhost:9200
  index: "pastebin"
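
As a rough illustration of how the kafka and shodan sections above could be represented once loaded, here is a minimal Go sketch. The type names, field names, and the BrokerAddress helper are hypothetical and not taken from the Styx source. Loading the YAML file itself is left out because it would need a third-party parser.

```go
package main

import (
	"fmt"
	"net"
	"strconv"
)

// KafkaConfig mirrors the kafka section of the example configuration.
// Hypothetical type: the names are illustrative, not from the Styx source.
type KafkaConfig struct {
	Activated bool
	Protocol  string
	Host      string
	Port      int
	Topic     string
	Partition int
}

// ShodanConfig mirrors the shodan section of the example configuration.
// Hypothetical type, same caveat as above.
type ShodanConfig struct {
	Activated bool
	Key       string
	Ports     []int
}

// BrokerAddress joins host and port into the "host:port" form that
// net.Dial expects as its address argument.
func (k KafkaConfig) BrokerAddress() string {
	return net.JoinHostPort(k.Host, strconv.Itoa(k.Port))
}

func main() {
	// Values copied from the example configuration.
	kafka := KafkaConfig{Activated: true, Protocol: "tcp", Host: "localhost", Port: 9092, Topic: "styx", Partition: 0}
	shodan := ShodanConfig{Activated: true, Key: "SHODAN_KEY", Ports: []int{80, 443}}

	fmt.Println(kafka.Protocol, kafka.BrokerAddress(), kafka.Topic)
	fmt.Println(shodan.Ports)
}
```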

Dgraph Interface

You can connect to the Dgraph interface at its default address, localhost:8000. From there you can run GraphQL+- queries. For example, to query a single node by its ID:

query {
    Node(func: eq(id, "node--cde8decb-0a8b-4d19-bd77-c2decb6dab9c")) {
        uid
        ndata
        modified
        type
        id
    }
}

Or filter nodes by type. This example works for certstream nodes:

query {
  Node(func: eq(type, "certstream")) {
    uid
    created
    modified
    type
    ndata
    certNode {
      uid
      fingerprint
      cn
      raw {
        uid
        id
      }
      chain {
        uid
        id
      }
      sourceName
      serialNumber
      basicConstrains
      notBefore
      notAfter
    }
    shodanNode {
      uid
      hostData {
        product
        ip
        version
        hostnames
        port
        html
      }
    }
  }
}

Example query for pastebin data:

query {
  Node(func: eq(type, "pastebin")) {
    uid
    created
    modified
    type
    ndata
    pasteNode {
      id
      type
      created
      modified
      fullPaste {
        full
        meta {
          full_url
          size
          expire
          title
          syntax
          user
          scrape_url
          date
          key
        }
      }
    }
  }
}
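
To consume the result of such a query in Go, the top-level fields can be decoded with encoding/json. This is only a sketch: it assumes the HTTP response wraps results as {"data": {"Node": [...]}}, handles just the scalar fields of the outer node, and uses a made-up sample payload rather than real output. The type and function names are hypothetical.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// queryResult models the assumed response envelope: results are keyed by
// the query block name ("Node") under a top-level "data" object.
type queryResult struct {
	Data struct {
		Node []resultNode `json:"Node"`
	} `json:"data"`
}

// resultNode holds the scalar fields requested in the queries above.
// Nested blocks such as pasteNode are ignored in this sketch.
type resultNode struct {
	UID      string `json:"uid"`
	Created  string `json:"created"`
	Modified string `json:"modified"`
	Type     string `json:"type"`
	NData    string `json:"ndata"`
}

// decodeNodes extracts the list of nodes from a raw response body.
func decodeNodes(body []byte) ([]resultNode, error) {
	var res queryResult
	if err := json.Unmarshal(body, &res); err != nil {
		return nil, err
	}
	return res.Data.Node, nil
}

func main() {
	// Made-up sample payload for illustration only.
	sample := []byte(`{"data":{"Node":[{"uid":"0x1","type":"pastebin","ndata":"example","created":"2020-01-01T00:00:00Z","modified":"2020-01-01T00:00:00Z"}]}}`)

	nodes, err := decodeNodes(sample)
	if err != nil {
		fmt.Println("decoding response:", err)
		return
	}
	for _, n := range nodes {
		fmt.Println(n.UID, n.Type, n.NData)
	}
}
```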

Data Structure

Meta

Node --[Edge]-- Node

// Node is the basic unit of the graph.
type Node struct {
	ID       string `json:"id"`
	Type     string `json:"type"`
	Data     string `json:"data"` // For plain Node, the data is the ID of another typed node or a unique value like a domain or a host name.
	Created  string `json:"created"`
	Modified string `json:"modified"`
}

// Edge defines a relation between two nodes.
type Edge struct {
	ID        string `json:"id"`
	NodeOneID string `json:"nodeOneID"`
	NodeTwoID string `json:"nodeTwoID"`
	Timestamp string `json:"timestamp"`
	Source    string `json:"source"`
}

Certstream

Node --[Edge]-- CertNode --[Edge]-- CertStreamRaw
Node(domain) --[Edge]-- CertNode


// CertStreamRaw is a wrapper around the stream function to unmarshal the
// data received into a Go structure.
type CertStreamRaw struct {
	ID       string           `json:"id"`
	Type     string           `json:"type"`
	Data     CertStreamStruct `json:"data"`
	Created  string           `json:"created"`
	Modified string           `json:"modified"`
}

// CertNode represents our custom struct of data extraction from CertStream.
type CertNode struct {
	ID               string     `json:"id"`
	Fingerprint      string     `json:"fingerprint"`
	NotBefore        string     `json:"notBefore"`
	NotAfter         string     `json:"notAfter"`
	CN               string     `json:"cn"`
	SourceName       string     `json:"sourceName"`
	SerialNumber     string     `json:"serialNumber"`
	BasicConstraints string     `json:"basicConstraints"`
	RawUUID          string     `json:"rawUUID"`
	Chain            []CertNode `json:"chainedTo"`
}

Pastebin

Node --[Edge]-- PasteNode --[Edge]-- FullPaste

// PasteNode is a node from PasteBin.
type PasteNode struct {
	ID       string    `json:"id"`
	Type     string    `json:"type"`
	Data     FullPaste `json:"data"`
	Created  string    `json:"created"`
	Modified string    `json:"modified"`
}

// FullPaste wraps the metadata and the full content of a paste from Pastebin.
type FullPaste struct {
	Meta PasteMeta `json:"meta"`
	Full string    `json:"full"`
}

Shodan

Node --[Edge]-- ShodanNode --[Edge]-- Node(s) (hostnames and domains)

// ShodanNode wraps the host data returned by Shodan.
type ShodanNode struct {
	ID       string           `json:"id"`
	Type     string           `json:"type"`
	Data     *shodan.HostData `json:"data"`
	Created  string           `json:"created"`
	Modified string           `json:"modified"`
}

Balboa

Balboa enrichment runs on the domains and hostnames extracted from the Certstream and Shodan streams. A BalboaNode is created only if Balboa returns data.

Node --[Edge]-- ShodanNode --[Edge]-- Node (domain) --[Edge]-- BalboaNode

// BalboaNode wraps the entries returned by Balboa for a domain or hostname.
type BalboaNode struct {
	ID       string           `json:"id"`
	Type     string           `json:"type"`
	Data     []balboa.Entries `json:"data"`
	Created  string           `json:"created"`
	Modified string           `json:"modified"`
}