Trying to generate a GraphQL schema from an existing DB

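Introduction

I started building something that, given the URL of an existing DB, serves a reasonable GraphQL-based API on top of it. GraphQL takes a schema, but that schema is graph-based, which is a bit of a problem: to implement the server side I need to know the foreign key and relation information, so generating the GraphQL schema directly will not work.

So the first step is to generate an intermediate file, one stage before the schema.

After that, I try generating the GraphQL schema from that intermediate representation.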

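Plan

It seemed like SQLAlchemy's automap feature would make this reasonably easy.

For now, it should be enough to produce data from which all of the following can be read:

Information about which tables exist
Information about the fields each table has
Information about the relations each table has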
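
To give a rough idea of what these three kinds of information look like, here is a minimal sketch that reads them directly from SQLite using the standard-library `sqlite3` module and `PRAGMA` statements. This is not the automap-based approach used in this post, only an illustration; the `authors`/`books` schema, the function name `describe`, and the shape of the returned dict are all made up for the example.

```python
import sqlite3

# Hypothetical schema used only for this illustration.
DDL = """
CREATE TABLE authors (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    name TEXT NOT NULL
);
CREATE TABLE books (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    author_id INTEGER,
    title TEXT,
    FOREIGN KEY(author_id) REFERENCES authors(id)
);
"""


def describe(conn):
    """Return {table: {"fields": {...}, "relations": [...]}} for every user table."""
    result = {}
    tables = [
        row[0]
        for row in conn.execute(
            "SELECT name FROM sqlite_master "
            "WHERE type = 'table' AND name NOT LIKE 'sqlite_%' ORDER BY name"
        )
    ]
    for table in tables:
        fields = {}
        # PRAGMA table_info rows: (cid, name, type, notnull, dflt_value, pk)
        for _cid, name, typ, notnull, _default, pk in conn.execute(
            'PRAGMA table_info("{}")'.format(table)
        ):
            fields[name] = {
                "type": typ,
                "nullable": not notnull and not pk,
                "primary_key": bool(pk),
            }
        relations = []
        # PRAGMA foreign_key_list rows:
        # (id, seq, table, from, to, on_update, on_delete, match)
        for row in conn.execute('PRAGMA foreign_key_list("{}")'.format(table)):
            relations.append({
                "from": "{}.{}".format(table, row[3]),
                "to": "{}.{}".format(row[2], row[4]),
            })
        result[table] = {"fields": fields, "relations": relations}
    return result


conn = sqlite3.connect(":memory:")
conn.executescript(DDL)
info = describe(conn)
```

Going through SQLAlchemy instead has the advantage that the same code works for databases other than SQLite, and that it also infers relationship directions.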

Trying it out

I tried it. I will clean it up properly later, but as a first prototype it looks good.

I prepared the following tables.

CREATE TABLE childs (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    name TEXT
);

CREATE TABLE kinds (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    name TEXT NOT NULL
);


CREATE TABLE dogs (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    kind_id INTEGER,
    dog TEXT,
    FOREIGN KEY(kind_id) REFERENCES kinds(id)
);

CREATE TABLE child_dogs (
    child_id INTEGER,
    dog_id INTEGER,
    FOREIGN KEY(child_id) REFERENCES childs(id),
    FOREIGN KEY(dog_id) REFERENCES dogs(id)
);

dogs and childs are many-to-many via child_dogs, and there is a foreign key from dogs to kinds. (Note that the database alone cannot tell whether a relation is one-to-one or one-to-many.)
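
The SQLAlchemy automap documentation describes treating a table as a many-to-many "secondary" table when it has exactly two foreign key constraints and every column belongs to one of them. The following is a minimal sketch of that rule over plain Python data, not automap's actual code; the function name `is_association_table` and the input shape are hypothetical.

```python
def is_association_table(columns, foreign_keys):
    """Guess whether a table is a pure many-to-many link table.

    columns: list of column names in the table.
    foreign_keys: list of foreign keys, each given as a list of local column names.
    """
    if len(foreign_keys) != 2:
        return False
    fk_columns = {name for fk in foreign_keys for name in fk}
    # Every column must take part in one of the two foreign keys.
    return set(columns) <= fk_columns


# The four tables from the schema above, written out by hand as
# (column names, foreign keys).
tables = {
    "childs": (["id", "name"], []),
    "kinds": (["id", "name"], []),
    "dogs": (["id", "kind_id", "dog"], [["kind_id"]]),
    "child_dogs": (["child_id", "dog_id"], [["child_id"], ["dog_id"]]),
}
link_tables = [
    name for name, (cols, fks) in tables.items() if is_association_table(cols, fks)
]
print(link_tables)
```

Under this rule only child_dogs qualifies, which is why it does not show up as a type of its own in the intermediate representation.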

Create the tables in SQLite, then pass the resulting DB's URL to a file called gen.py (shown later).

$ cat create.sql | sqlite3 dog.db
$ python gen.py --src 'sqlite:///./dog.db'

This produces the following intermediate representation. It is still rough.

{
  "dogs": {
    "kinds": {
      "uselist": false,
      "direction": "MANYTOONE",
      "table": "kinds",
      "relation": {
        "to": "dogs.kind_id",
        "from": "kinds.id"
      }
    },
    "id": {
      "nullable": false,
      "type": "ID"
    },
    "kind_id": {
      "nullable": true,
      "type": "int"
    },
    "dog": {
      "nullable": true,
      "type": "str"
    },
    "childs_collection": {
      "uselist": true,
      "direction": "MANYTOMANY",
      "table": "childs",
      "relation": {
        "to": "child_dogs.dog_id",
        "from": "dogs.id"
      }
    }
  },
  "kinds": {
    "id": {
      "nullable": false,
      "type": "ID"
    },
    "name": {
      "nullable": false,
      "type": "str"
    },
    "dogs_collection": {
      "uselist": true,
      "direction": "ONETOMANY",
      "table": "dogs",
      "relation": {
        "to": "dogs.kind_id",
        "from": "kinds.id"
      }
    }
  },
  "childs": {
    "dogs_collection": {
      "uselist": true,
      "direction": "MANYTOMANY",
      "table": "dogs",
      "relation": {
        "to": "child_dogs.child_id",
        "from": "childs.id"
      }
    },
    "id": {
      "nullable": false,
      "type": "ID"
    },
    "name": {
      "nullable": true,
      "type": "str"
    }
  }
}

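Information about which tables exist

We can see that the tables dogs, childs, and kinds exist.

Information about the fields each table has

Entries with a type key are fields. Primary key columns get the type ID.

Information about the relations each table has

The from and to of each relation are available, and so is the direction, so this looks sufficient.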
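
As a small illustration of how this intermediate representation could be consumed, the sketch below splits one table's entries into fields and relations. It recognizes a relation by the presence of a `direction` key; the helper name `split_entries` is hypothetical, and the `kinds` dict is a hand-copied excerpt of the output above.

```python
def split_entries(table_def):
    """Split one table's entries into (fields, relations).

    An entry is treated as a relation when it has a "direction" key.
    """
    fields, relations = {}, {}
    for key, value in table_def.items():
        if "direction" in value:
            relations[key] = value
        else:
            fields[key] = value
    return fields, relations


# Hand-copied excerpt of the "kinds" entry from the output above.
kinds = {
    "id": {"nullable": False, "type": "ID"},
    "name": {"nullable": False, "type": "str"},
    "dogs_collection": {
        "uselist": True,
        "direction": "ONETOMANY",
        "relation": {"to": "dogs.kind_id", "from": "kinds.id"},
    },
}

fields, relations = split_entries(kinds)
for key, rel in relations.items():
    print(key, rel["direction"], rel["relation"]["from"], "->", rel["relation"]["to"])
```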

gen.py

The script I wrote looks like this.

from collections import OrderedDict
from sqlalchemy.ext.automap import automap_base
from sqlalchemy import create_engine
from sqlalchemy.inspection import inspect


def collect(classes):
    """Build {table name: description} for every automapped class."""
    d = OrderedDict()
    for c in classes:
        mapper = inspect(c)
        d[mapper.local_table.fullname] = _collect_from_mapper(mapper)
    return d


def _collect_from_mapper(m):
    """Describe the columns and relationships of a single mapper."""
    d = OrderedDict()
    for prop in m.iterate_properties:
        if hasattr(prop, "direction"):
            # Relationship property (only these have a direction).
            pairs = prop.synchronize_pairs
            assert len(pairs) == 1, "multi keys are not supported"
            d[prop.key] = {
                "table": prop.target.fullname,
                "direction": prop.direction.name,
                "uselist": prop.uselist,
                "relation": {
                    "to": "{}.{}".format(pairs[0][0].table.fullname, pairs[0][0].name),
                    "from": "{}.{}".format(pairs[0][1].table.fullname, pairs[0][1].name),
                }
            }
        else:
            # Plain column property.
            assert len(prop.columns) == 1, "multi keys are not supported"
            c = prop.columns[0]
            d[prop.key] = {
                # Primary keys become "ID"; everything else uses the Python type name.
                "type": "ID" if c.primary_key else c.type.python_type.__name__,
                "nullable": c.nullable,
            }
    return d


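def main(src):
    Base = automap_base()
    engine = create_engine(src)
    # Reflect the existing tables and generate mapped classes from them.
    # On SQLAlchemy 1.4 and later this is spelled Base.prepare(autoload_with=engine).
    Base.prepare(engine, reflect=True)

    from dictknife import loading
    d = collect(Base.classes)
    # No filename is given, so the JSON goes to stdout.
    loading.dumpfile(d, format="json")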


if __name__ == "__main__":
    import argparse
    parser = argparse.ArgumentParser()
    parser.add_argument("--src", default="sqlite:///./dog.db")
    args = parser.parse_args()
    main(args.src)

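Trying to generate a GraphQL schema from the intermediate representation

Using the intermediate JSON file created above, I try building a GraphQL schema.

Results

The names are not great, but it comes out like this.

# gen.json is the intermediate JSON generated above
$ python convert.py gen.json

The result looks like this: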

type Child {
    dogs_collection: [Dog]
    id: ID!
    name: String
}
type Dog {
    kinds: Kind
    id: ID!
    kind_id: Integer
    dog: String
    childs_collection: [Child]
}
type Kind {
    id: ID!
    name: String!
    dogs_collection: [Dog]
}

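The type names are derived from the table names, so they may look slightly odd (I was being lazy and named the table childs rather than children). I have not looked into it seriously, so the way the types are written may be wrong; for instance, GraphQL's built-in integer scalar is Int, not Integer. It is only a prototype for now.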
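
One way to tidy up the type names would be an explicit mapping from the Python type names in the intermediate representation to GraphQL's built-in scalars (Int, Float, String, Boolean, ID). The following is a minimal sketch of that idea and is not part of the converter in this post; the function name `to_graphql_type` is hypothetical, and falling back to String for unknown types is an assumption made for the example.

```python
# GraphQL's built-in scalar types are Int, Float, String, Boolean and ID.
SCALARS = {
    "ID": "ID",
    "int": "Int",
    "float": "Float",
    "str": "String",
    "bool": "Boolean",
}


def to_graphql_type(python_type_name, nullable=True, fallback="String"):
    """Map a type name from the intermediate representation to a GraphQL type.

    Unknown names map to `fallback` (an assumption made for this sketch).
    Non-nullable types get a trailing "!".
    """
    name = SCALARS.get(python_type_name, fallback)
    return name if nullable else name + "!"


print(to_graphql_type("int"))                  # kind_id
print(to_graphql_type("str", nullable=False))  # kinds.name
print(to_graphql_type("ID", nullable=False))   # id
```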

I need to go back over the documentation and the like a little later, around this topic.

Come to think of it, union and enum are not supported.

Code

The code looks like this. It requires prestring and dictknife.

# -*- coding:utf-8 -*-
from dictknife import loading
from prestring import Module
import contextlib
import logging
logger = logging.getLogger(__name__)


def titleize(name):
    if not name:
        return name
    return name[0].upper() + name[1:]


def singular(name):
    if name.endswith("s"):
        return titleize(name[:-1])
    return titleize(name)


class Array:
    def __init__(self, t):
        self.t = t

    def __str__(self):
        return "[{}]".format(self.t)


class GraphQLModule(Module):
    @contextlib.contextmanager
    def type_(self, name):
        self.stmt("type {} {{", name)
        with self.scope():
            yield
        self.stmt("}")

    def field(self, name, typ, nullable=True):
        if nullable:
            self.stmt("{}: {}", name, typ)
        else:
            self.stmt("{}: {}!", name, typ)


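def emit(m, d):
    for name, fields in d.items():
        with m.type_(singular(name)):
            for k, v in fields.items():
                if "type" in v:
                    # Column: v["type"] is written out as-is (e.g. "int", "str", "ID");
                    # no mapping to GraphQL scalar names happens here.
                    m.field(k, v["type"], nullable=v.get("nullable", True))
                else:
                    # Relation: refer to the type generated for the target table.
                    if v["uselist"]:
                        m.field(k, Array(singular(v["table"])), nullable=v.get("nullable", True))
                    else:
                        m.field(k, singular(v["table"]), nullable=v.get("nullable", True))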


def main(src):
    d = loading.loadfile(src)
    m = GraphQLModule()
    emit(m, d)
    print(m)


if __name__ == "__main__":
    import argparse
    parser = argparse.ArgumentParser()
    parser.add_argument("src")
    args = parser.parse_args()
    main(args.src)