prestring now supports async/await


prestring now supports async/await.

(I also quietly added type annotations checked with mypy; that was actually the harder part.)

prestring?

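prestring is a library for generating strings (in practice, source code). It makes heavy, arguably abusive, use of the with statement.

For example, code like this: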

00main.py

from prestring.python import Module

m = Module()

with m.def_("main"):
    m.stmt("print('hello world')")

with m.if_("__name__ == '__main__'"):
    m.stmt("main()")

print(m)

outputs the following code.

def main():
    print('hello world')


if __name__ == '__main__':
    main()

So if you execute the output using process substitution, it prints "hello world".

$ python <(python 00main.py)
hello world
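
To give a rough feel for how a with-based emitter can work, here is a minimal sketch using only the standard library. It is not prestring's actual implementation: the class name MiniModule and its block helper are hypothetical, and it only tracks an indentation level while collecting lines.

```python
from contextlib import contextmanager


class MiniModule:
    """Toy emitter (hypothetical): collects lines and tracks indentation depth."""

    def __init__(self, indent="    "):
        self.indent = indent
        self.level = 0
        self.lines = []

    def stmt(self, code):
        # Emit one line at the current indentation level.
        self.lines.append(self.indent * self.level + code)

    @contextmanager
    def block(self, header):
        # Emit "header:" and indent everything written inside the with block.
        self.stmt(header + ":")
        self.level += 1
        try:
            yield
        finally:
            self.level -= 1

    def def_(self, name, *args):
        return self.block("def {}({})".format(name, ", ".join(args)))

    def if_(self, cond):
        return self.block("if " + cond)

    def __str__(self):
        return "\n".join(self.lines)


m = MiniModule()
with m.def_("main"):
    m.stmt("print('hello world')")
with m.if_("__name__ == '__main__'"):
    m.stmt("main()")

generated = str(m)
print(generated)
```

Unlike prestring, this toy version does not insert blank lines between top-level blocks, but the nesting of the with statements maps to the nesting of the generated code in the same way.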

prestring.transform

If writing this by hand is tedious, pass existing source code to python -m prestring.python, and it outputs the code that generates that source.

For example, try passing it the following code.

01fizzbuzz.py

def fizzbuzz(n: int) -> str:
    if n % 3 == 0 and n % 5 == 0:
        return "fizzbuzz"
    elif n % 3 == 0:
        return "fizz"
    elif n % 5 == 0:
        return "buzz"
    else:
        return str(n)


if __name__ == "__main__":
    print(", ".join(fizzbuzz(i) for i in range(1, 21)))

It outputs code that generates this code.

$ python -m prestring.python 01fizzbuzz.py
from prestring.python import PythonModule


def gen(*, m=None, indent='    '):
    m = m or PythonModule(indent=indent)

    with m.def_('fizzbuzz', 'n: int', return_type='str'):
        with m.if_('n % 3 == 0 and n % 5 == 0'):
            m.stmt('return "fizzbuzz"')
        with m.elif_('n % 3 == 0'):
            m.stmt('return "fizz"')
        with m.elif_('n % 5 == 0'):
            m.stmt('return "buzz"')
        with m.else_():
            m.stmt('return str(n)')

    with m.if_('__name__ == "__main__"'):
        m.stmt('print(", ".join(fizzbuzz(i) for i in range(1, 21)))')
    return m


if __name__ == "__main__":
    m = gen(indent='    ')
    print(m)

Of course, if you run the code that this generated code outputs, it behaves like the original.

$ python <(python <(python -m prestring.python 01fizzbuzz.py))
1, 2, fizz, 4, buzz, fizz, 7, 8, fizz, buzz, 11, fizz, 13, 14, fizzbuzz, 16, 17, fizz, 19, buzz

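async/await support

Now, finally, the main topic. Until now, prestring did not support async/await syntax; this release adds it. Passing the option async_=True makes a block be emitted as async def, async for, and so on.

For example, the Future example from the Python documentation can be written as follows.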

from prestring.python import PythonModule as Module


m = Module()

m.import_("asyncio")
m.stmt("# from: https://docs.python.org/ja/3/library/asyncio-future.html")
m.sep()

with m.def_("set_after", "fut", "delay", "value", async_=True):
    m.stmt("# Sleep for *delay* seconds.")
    m.stmt("await asyncio.sleep(delay)")

    m.stmt("# Set *value* as a result of *fut* Future.")
    m.stmt("fut.set_result(value)")

with m.def_("main", async_=True):
    m.stmt("# Get the current event loop.")
    m.stmt("loop = asyncio.get_running_loop()")

    m.stmt("# Create a new Future object.")
    m.stmt("fut = loop.create_future()")

    m.stmt('# Run "set_after()" coroutine in a parallel Task.')
    m.stmt('# We are using the low-level "loop.create_task()" API here because')
    m.stmt("# we already have a reference to the event loop at hand.")
    m.stmt('# Otherwise we could have just used "asyncio.create_task()".')

    m.stmt('loop.create_task(set_after(fut, 1, "... world"))')
    m.stmt('print("hello ...")')

    m.stmt("# Wait until *fut* has a result (1 second) and print it.")
    m.stmt("print(await fut)")

m.stmt("asyncio.run(main())")
print(m)

Result

$ python 02asyncio-future.py
import asyncio
# from: https://docs.python.org/ja/3/library/asyncio-future.html


async def set_after(fut, delay, value):
    # Sleep for *delay* seconds.
    await asyncio.sleep(delay)
    # Set *value* as a result of *fut* Future.
    fut.set_result(value)


async def main():
    # Get the current event loop.
    loop = asyncio.get_running_loop()
    # Create a new Future object.
    fut = loop.create_future()
    # Run "set_after()" coroutine in a parallel Task.
    # We are using the low-level "loop.create_task()" API here because
    # we already have a reference to the event loop at hand.
    # Otherwise we could have just used "asyncio.create_task()".
    loop.create_task(set_after(fut, 1, "... world"))
    print("hello ...")
    # Wait until *fut* has a result (1 second) and print it.
    print(await fut)


asyncio.run(main())
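
One way an async_ flag like this could be implemented is to prepend "async " to the block header when the flag is set. The following is a minimal sketch under that assumption, not prestring's real code: MiniAsyncModule and its _block helper are hypothetical names. It also executes the generated source to check that it is a valid coroutine function.

```python
import asyncio
from contextlib import contextmanager


class MiniAsyncModule:
    """Toy emitter (hypothetical) whose block helpers accept async_=True."""

    def __init__(self, indent="    "):
        self.indent = indent
        self.level = 0
        self.lines = []

    def stmt(self, code):
        self.lines.append(self.indent * self.level + code)

    @contextmanager
    def _block(self, header, async_=False):
        # The only difference for async blocks is the "async " prefix.
        prefix = "async " if async_ else ""
        self.stmt(prefix + header + ":")
        self.level += 1
        try:
            yield
        finally:
            self.level -= 1

    def def_(self, name, *args, async_=False):
        return self._block("def {}({})".format(name, ", ".join(args)), async_)

    def for_(self, expr, async_=False):
        return self._block("for " + expr, async_)

    def with_(self, expr, async_=False):
        return self._block("with " + expr, async_)

    def __str__(self):
        return "\n".join(self.lines)


m = MiniAsyncModule()
with m.def_("double", "x", async_=True):
    m.stmt("await asyncio.sleep(0)")
    m.stmt("return x * 2")
source = str(m)
print(source)

# Execute the generated source and await the coroutine it defines.
namespace = {"asyncio": asyncio}
exec(source, namespace)
result = asyncio.run(namespace["double"](21))
print(result)
```

Because def, for, and with differ from their async forms only by the leading keyword, a single flag on the shared block helper is enough in this sketch.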

async for, async with

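Of course, async for and async with are supported as well. The python -m prestring.python command shown earlier also handles code that uses async/await, although its output will not be exactly identical to what you would write by hand.

For example, given a trivial piece of code like this: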

aexecutor.py

import typing as t
import asyncio
from functools import partial


class AExecutor:
    def __init__(self):
        self.q = asyncio.Queue()

    async def __aenter__(self) -> t.Callable[..., t.Awaitable[None]]:
        async def loop():
            while True:
                afn = await self.q.get()
                if afn is None:
                    self.q.task_done()
                    break
                await afn()
                self.q.task_done()

        asyncio.ensure_future(loop())
        return self.q.put

    async def __aexit__(
        self,
        exc: t.Optional[t.Type[BaseException]],
        value: t.Optional[BaseException],
        tb: t.Any,
    ):
        await self.q.put(None)
        await self.q.join()


async def arange(n: int) -> t.AsyncIterator[int]:
    for i in range(n):
        yield i
        await asyncio.sleep(0.1)


async def run():
    async with AExecutor() as submit:

        async def task(tag: str):
            async for i in arange(3):
                print(tag, i)

        await submit(partial(task, "x"))
        await submit(partial(task, "y"))


def main():
    asyncio.run(run())


if __name__ == "__main__":
    main()

python -m prestring.python generates the following code.

$ python -m prestring.python aexecutor.py
from prestring.python import PythonModule


def gen(*, m=None, indent='    '):
    m = m or PythonModule(indent=indent)

    m.import_('typing as t')
    m.import_('asyncio')
    m.from_('functools', 'partial')
    with m.class_('AExecutor'):
        with m.def_('__init__', 'self'):
            m.stmt('self.q = asyncio.Queue()')

        with m.def_('__aenter__', 'self', return_type='t.Callable[..., t.Awaitable[None]]', async_=True):
            with m.def_('loop', async_=True):
                with m.while_('True'):
                    m.stmt('afn = await self.q.get()')
                    with m.if_('afn is None'):
                        m.stmt('self.q.task_done()')
                        m.stmt('break')
                    m.stmt('await afn()')
                    m.stmt('self.q.task_done()')

            m.stmt('asyncio.ensure_future(loop())')
            m.stmt('return self.q.put')

        with m.def_('__aexit__', 'self', 'exc: t.Optional[t.Type[BaseException]]', 'value: t.Optional[BaseException]', 'tb: t.Any', async_=True):
            m.stmt('await self.q.put(None)')
            m.stmt('await self.q.join()')


    with m.def_('arange', 'n: int', return_type='t.AsyncIterator[int]', async_=True):
        with m.for_('i in range(n)'):
            m.stmt('yield i')
            m.stmt('await asyncio.sleep(0.1)')

    with m.def_('run', async_=True):
        with m.with_('AExecutor() as submit', async_=True):
            with m.def_('task', 'tag: str', async_=True):
                with m.for_('i in arange(3)', async_=True):
                    m.stmt('print(tag, i)')

            m.stmt('await submit(partial(task, "x"))')
            m.stmt('await submit(partial(task, "y"))')

    with m.def_('main'):
        m.stmt('asyncio.run(run())')

    with m.if_('__name__ == "__main__"'):
        m.stmt('main()')
    return m


if __name__ == "__main__":
    m = gen(indent='    ')
    print(m)

Of course, the output is valid Python code, so it can be executed as well.

$ python <(python <(python -m prestring.python aexecutor.py))
x 0
x 1
x 2
y 0
y 1
y 2
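
To illustrate which constructs a converter has to recognize in order to emit async_=True, here is a small sketch that walks a source file and lists the async nodes. Note the assumptions: prestring itself works on the lib2to3 tree, while this sketch substitutes the standard library's ast module, and the names ASYNC_NODES and find_async_constructs are hypothetical.

```python
import ast

# A trimmed-down piece of the aexecutor.py example; it is only parsed, never run.
SOURCE = '''
async def run():
    async with AExecutor() as submit:
        async def task(tag):
            async for i in arange(3):
                print(tag, i)

def main():
    pass
'''

# Map each async AST node type to the emitter method that would need async_=True.
ASYNC_NODES = {
    ast.AsyncFunctionDef: "def_",
    ast.AsyncFor: "for_",
    ast.AsyncWith: "with_",
}


def find_async_constructs(source):
    """Return sorted (lineno, method) pairs for every async construct in source."""
    found = []
    for node in ast.walk(ast.parse(source)):
        method = ASYNC_NODES.get(type(node))
        if method is not None:
            found.append((node.lineno, method))
    return sorted(found)


constructs = find_async_constructs(SOURCE)
for lineno, method in constructs:
    print(lineno, method)
```

The plain def main() is not reported, which corresponds to the m.def_('main') call without async_=True in the generated code above.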

Bonus: prestring.python.parse

While writing the code for python -m prestring.python, I found it tedious to check the lib2to3 AST, so I also added the ability to dump the AST with python -m prestring.python.parse.

$ python -m prestring.python.parse aexecutor.py
file_input [9 children]
  simple_stmt [2 children]
    import_name [2 children]
      NAME('import') [lineno=1, column=0, prefix='']
      dotted_as_name [3 children]
        NAME('typing') [lineno=1, column=7, prefix=' ']
        NAME('as') [lineno=1, column=14, prefix=' ']
        NAME('t') [lineno=1, column=17, prefix=' ']
    NEWLINE('\n') [lineno=1, column=18, prefix='']
  simple_stmt [2 children]
    import_name [2 children]
      NAME('import') [lineno=2, column=0, prefix='']
      NAME('asyncio') [lineno=2, column=7, prefix=' ']
    NEWLINE('\n') [lineno=2, column=14, prefix='']
  simple_stmt [2 children]
    import_from [4 children]
      NAME('from') [lineno=3, column=0, prefix='']
      NAME('functools') [lineno=3, column=5, prefix=' ']
      NAME('import') [lineno=3, column=15, prefix=' ']
      NAME('partial') [lineno=3, column=22, prefix=' ']
    NEWLINE('\n') [lineno=3, column=29, prefix='']
  classdef[name='AExecutor'] [4 children]
    NAME('class') [lineno=6, column=0, prefix='\n\n']
    NAME('AExecutor') [lineno=6, column=6, prefix=' ']
    COLON(':') [lineno=6, column=15, prefix='']
    suite [6 children]
      NEWLINE('\n') [lineno=6, column=16, prefix='']
      INDENT('    ') [lineno=7, column=0, prefix='']
      funcdef[name='__init__'] [5 children]
        NAME('def') [lineno=7, column=4, prefix='']
        NAME('__init__') [lineno=7, column=8, prefix=' ']
        parameters [3 children]
          LPAR('(') [lineno=7, column=16, prefix='']
          NAME('self') [lineno=7, column=17, prefix='']
          RPAR(')') [lineno=7, column=21, prefix='']
        COLON(':') [lineno=7, column=22, prefix='']
        suite [4 children]
          NEWLINE('\n') [lineno=7, column=23, prefix='']
          INDENT('        ') [lineno=8, column=0, prefix='']
          simple_stmt [2 children]
            expr_stmt [3 children]
              power [2 children]
                NAME('self') [lineno=8, column=8, prefix='']
                trailer [2 children]
                  DOT('.') [lineno=8, column=12, prefix='']
                  NAME('q') [lineno=8, column=13, prefix='']
              EQUAL('=') [lineno=8, column=15, prefix=' ']
              power [3 children]
                NAME('asyncio') [lineno=8, column=17, prefix=' ']
                trailer [2 children]
                  DOT('.') [lineno=8, column=24, prefix='']
                  NAME('Queue') [lineno=8, column=25, prefix='']
                trailer [2 children]
                  LPAR('(') [lineno=8, column=30, prefix='']
                  RPAR(')') [lineno=8, column=31, prefix='']
            NEWLINE('\n') [lineno=8, column=32, prefix='']
          DEDENT('') [lineno=10, column=4, prefix='\n    ']
      async_stmt [2 children]
        ASYNC('async') [lineno=10, column=4, prefix='']
        funcdef[name='__aenter__'] [7 children]
          NAME('def') [lineno=10, column=10, prefix=' ']
          NAME('__aenter__') [lineno=10, column=14, prefix=' ']
          parameters [3 children]
            LPAR('(') [lineno=10, column=24, prefix='']
            NAME('self') [lineno=10, column=25, prefix='']
            RPAR(')') [lineno=10, column=29, prefix='']
          RARROW('->') [lineno=10, column=31, prefix=' ']
          power [3 children]
            NAME('t') [lineno=10, column=34, prefix=' ']
            trailer [2 children]
              DOT('.') [lineno=10, column=35, prefix='']
              NAME('Callable') [lineno=10, column=36, prefix='']
            trailer [3 children]
              LSQB('[') [lineno=10, column=44, prefix='']
              subscriptlist [3 children]
                atom [3 children]
                  DOT('.') [lineno=10, column=45, prefix='']
                  DOT('.') [lineno=10, column=46, prefix='']
                  DOT('.') [lineno=10, column=47, prefix='']
                COMMA(',') [lineno=10, column=48, prefix='']
                power [3 children]
                  NAME('t') [lineno=10, column=50, prefix=' ']
                  trailer [2 children]
                    DOT('.') [lineno=10, column=51, prefix='']
                    NAME('Awaitable') [lineno=10, column=52, prefix='']
                  trailer [3 children]
                    LSQB('[') [lineno=10, column=61, prefix='']
                    NAME('None') [lineno=10, column=62, prefix='']
                    RSQB(']') [lineno=10, column=66, prefix='']
              RSQB(']') [lineno=10, column=67, prefix='']
          COLON(':') [lineno=10, column=68, prefix='']
          suite [6 children]
            NEWLINE('\n') [lineno=10, column=69, prefix='']
            INDENT('        ') [lineno=11, column=0, prefix='']
            async_stmt [2 children]
              ASYNC('async') [lineno=11, column=8, prefix='']
              funcdef[name='loop'] [5 children]
                NAME('def') [lineno=11, column=14, prefix=' ']
                NAME('loop') [lineno=11, column=18, prefix=' ']
                parameters [2 children]
                  LPAR('(') [lineno=11, column=22, prefix='']
                  RPAR(')') [lineno=11, column=23, prefix='']
                COLON(':') [lineno=11, column=24, prefix='']
                suite [4 children]
                  NEWLINE('\n') [lineno=11, column=25, prefix='']
                  INDENT('            ') [lineno=12, column=0, prefix='']
                  while_stmt [4 children]
                    NAME('while') [lineno=12, column=12, prefix='']
                    NAME('True') [lineno=12, column=18, prefix=' ']
                    COLON(':') [lineno=12, column=22, prefix='']
                    suite [7 children]
                      NEWLINE('\n') [lineno=12, column=23, prefix='']
                      INDENT('                ') [lineno=13, column=0, prefix='']
                      simple_stmt [2 children]
                        expr_stmt [3 children]
                          NAME('afn') [lineno=13, column=16, prefix='']
                          EQUAL('=') [lineno=13, column=20, prefix=' ']
                          power [5 children]
                            AWAIT('await') [lineno=13, column=22, prefix=' ']
                            NAME('self') [lineno=13, column=28, prefix=' ']
                            trailer [2 children]
                              DOT('.') [lineno=13, column=32, prefix='']
                              NAME('q') [lineno=13, column=33, prefix='']
                            trailer [2 children]
                              DOT('.') [lineno=13, column=34, prefix='']
                              NAME('get') [lineno=13, column=35, prefix='']
                            trailer [2 children]
                              LPAR('(') [lineno=13, column=38, prefix='']
                              RPAR(')') [lineno=13, column=39, prefix='']
                        NEWLINE('\n') [lineno=13, column=40, prefix='']
                      if_stmt [4 children]
                        NAME('if') [lineno=14, column=16, prefix='                ']
                        comparison [3 children]
                          NAME('afn') [lineno=14, column=19, prefix=' ']
                          NAME('is') [lineno=14, column=23, prefix=' ']
                          NAME('None') [lineno=14, column=26, prefix=' ']
                        COLON(':') [lineno=14, column=30, prefix='']
                        suite [5 children]
                          NEWLINE('\n') [lineno=14, column=31, prefix='']
                          INDENT('                    ') [lineno=15, column=0, prefix='']
                          simple_stmt [2 children]
                            power [4 children]
                              NAME('self') [lineno=15, column=20, prefix='']
                              trailer [2 children]
                                DOT('.') [lineno=15, column=24, prefix='']
                                NAME('q') [lineno=15, column=25, prefix='']
                              trailer [2 children]
                                DOT('.') [lineno=15, column=26, prefix='']
                                NAME('task_done') [lineno=15, column=27, prefix='']
                              trailer [2 children]
                                LPAR('(') [lineno=15, column=36, prefix='']
                                RPAR(')') [lineno=15, column=37, prefix='']
                            NEWLINE('\n') [lineno=15, column=38, prefix='']
                          simple_stmt [2 children]
                            NAME('break') [lineno=16, column=20, prefix='                    ']
                            NEWLINE('\n') [lineno=16, column=25, prefix='']
                          DEDENT('') [lineno=17, column=16, prefix='                ']
                      simple_stmt [2 children]
                        power [3 children]
                          AWAIT('await') [lineno=17, column=16, prefix='']
                          NAME('afn') [lineno=17, column=22, prefix=' ']
                          trailer [2 children]
                            LPAR('(') [lineno=17, column=25, prefix='']
                            RPAR(')') [lineno=17, column=26, prefix='']
                        NEWLINE('\n') [lineno=17, column=27, prefix='']
                      simple_stmt [2 children]
                        power [4 children]
                          NAME('self') [lineno=18, column=16, prefix='                ']
                          trailer [2 children]
                            DOT('.') [lineno=18, column=20, prefix='']
                            NAME('q') [lineno=18, column=21, prefix='']
                          trailer [2 children]
                            DOT('.') [lineno=18, column=22, prefix='']
                            NAME('task_done') [lineno=18, column=23, prefix='']
                          trailer [2 children]
                            LPAR('(') [lineno=18, column=32, prefix='']
                            RPAR(')') [lineno=18, column=33, prefix='']
                        NEWLINE('\n') [lineno=18, column=34, prefix='']
                      DEDENT('') [lineno=20, column=8, prefix='\n        ']
                  DEDENT('') [lineno=20, column=8, prefix='']
            simple_stmt [2 children]
              power [3 children]
                NAME('asyncio') [lineno=20, column=8, prefix='']
                trailer [2 children]
                  DOT('.') [lineno=20, column=15, prefix='']
                  NAME('ensure_future') [lineno=20, column=16, prefix='']
                trailer [3 children]
                  LPAR('(') [lineno=20, column=29, prefix='']
                  power [2 children]
                    NAME('loop') [lineno=20, column=30, prefix='']
                    trailer [2 children]
                      LPAR('(') [lineno=20, column=34, prefix='']
                      RPAR(')') [lineno=20, column=35, prefix='']
                  RPAR(')') [lineno=20, column=36, prefix='']
              NEWLINE('\n') [lineno=20, column=37, prefix='']
            simple_stmt [2 children]
              return_stmt [2 children]
                NAME('return') [lineno=21, column=8, prefix='        ']
                power [3 children]
                  NAME('self') [lineno=21, column=15, prefix=' ']
                  trailer [2 children]
                    DOT('.') [lineno=21, column=19, prefix='']
                    NAME('q') [lineno=21, column=20, prefix='']
                  trailer [2 children]
                    DOT('.') [lineno=21, column=21, prefix='']
                    NAME('put') [lineno=21, column=22, prefix='']
              NEWLINE('\n') [lineno=21, column=25, prefix='']
            DEDENT('') [lineno=23, column=4, prefix='\n    ']
      async_stmt [2 children]
        ASYNC('async') [lineno=23, column=4, prefix='']
        funcdef[name='__aexit__'] [5 children]
          NAME('def') [lineno=23, column=10, prefix=' ']
          NAME('__aexit__') [lineno=23, column=14, prefix=' ']
          parameters [3 children]
            LPAR('(') [lineno=23, column=23, prefix='']
            typedargslist[args='self' ',' 'tname' ',' 'tname' ',' 'tname' ','] [8 children]
              NAME('self') [lineno=24, column=8, prefix='\n        ']
              COMMA(',') [lineno=24, column=12, prefix='']
              tname [3 children]
                NAME('exc') [lineno=25, column=8, prefix='\n        ']
                COLON(':') [lineno=25, column=11, prefix='']
                power [3 children]
                  NAME('t') [lineno=25, column=13, prefix=' ']
                  trailer [2 children]
                    DOT('.') [lineno=25, column=14, prefix='']
                    NAME('Optional') [lineno=25, column=15, prefix='']
                  trailer [3 children]
                    LSQB('[') [lineno=25, column=23, prefix='']
                    power [3 children]
                      NAME('t') [lineno=25, column=24, prefix='']
                      trailer [2 children]
                        DOT('.') [lineno=25, column=25, prefix='']
                        NAME('Type') [lineno=25, column=26, prefix='']
                      trailer [3 children]
                        LSQB('[') [lineno=25, column=30, prefix='']
                        NAME('BaseException') [lineno=25, column=31, prefix='']
                        RSQB(']') [lineno=25, column=44, prefix='']
                    RSQB(']') [lineno=25, column=45, prefix='']
              COMMA(',') [lineno=25, column=46, prefix='']
              tname [3 children]
                NAME('value') [lineno=26, column=8, prefix='\n        ']
                COLON(':') [lineno=26, column=13, prefix='']
                power [3 children]
                  NAME('t') [lineno=26, column=15, prefix=' ']
                  trailer [2 children]
                    DOT('.') [lineno=26, column=16, prefix='']
                    NAME('Optional') [lineno=26, column=17, prefix='']
                  trailer [3 children]
                    LSQB('[') [lineno=26, column=25, prefix='']
                    NAME('BaseException') [lineno=26, column=26, prefix='']
                    RSQB(']') [lineno=26, column=39, prefix='']
              COMMA(',') [lineno=26, column=40, prefix='']
              tname [3 children]
                NAME('tb') [lineno=27, column=8, prefix='\n        ']
                COLON(':') [lineno=27, column=10, prefix='']
                power [2 children]
                  NAME('t') [lineno=27, column=12, prefix=' ']
                  trailer [2 children]
                    DOT('.') [lineno=27, column=13, prefix='']
                    NAME('Any') [lineno=27, column=14, prefix='']
              COMMA(',') [lineno=27, column=17, prefix='']
            RPAR(')') [lineno=28, column=4, prefix='\n    ']
          COLON(':') [lineno=28, column=5, prefix='']
          suite [5 children]
            NEWLINE('\n') [lineno=28, column=6, prefix='']
            INDENT('        ') [lineno=29, column=0, prefix='']
            simple_stmt [2 children]
              power [5 children]
                AWAIT('await') [lineno=29, column=8, prefix='']
                NAME('self') [lineno=29, column=14, prefix=' ']
                trailer [2 children]
                  DOT('.') [lineno=29, column=18, prefix='']
                  NAME('q') [lineno=29, column=19, prefix='']
                trailer [2 children]
                  DOT('.') [lineno=29, column=20, prefix='']
                  NAME('put') [lineno=29, column=21, prefix='']
                trailer [3 children]
                  LPAR('(') [lineno=29, column=24, prefix='']
                  NAME('None') [lineno=29, column=25, prefix='']
                  RPAR(')') [lineno=29, column=29, prefix='']
              NEWLINE('\n') [lineno=29, column=30, prefix='']
            simple_stmt [2 children]
              power [5 children]
                AWAIT('await') [lineno=30, column=8, prefix='        ']
                NAME('self') [lineno=30, column=14, prefix=' ']
                trailer [2 children]
                  DOT('.') [lineno=30, column=18, prefix='']
                  NAME('q') [lineno=30, column=19, prefix='']
                trailer [2 children]
                  DOT('.') [lineno=30, column=20, prefix='']
                  NAME('join') [lineno=30, column=21, prefix='']
                trailer [2 children]
                  LPAR('(') [lineno=30, column=25, prefix='']
                  RPAR(')') [lineno=30, column=26, prefix='']
              NEWLINE('\n') [lineno=30, column=27, prefix='']
            DEDENT('') [lineno=33, column=0, prefix='\n\n']
      DEDENT('') [lineno=33, column=0, prefix='']
  async_stmt [2 children]
    ASYNC('async') [lineno=33, column=0, prefix='']
    funcdef[name='arange'] [7 children]
      NAME('def') [lineno=33, column=6, prefix=' ']
      NAME('arange') [lineno=33, column=10, prefix=' ']
      parameters [3 children]
        LPAR('(') [lineno=33, column=16, prefix='']
        tname [3 children]
          NAME('n') [lineno=33, column=17, prefix='']
          COLON(':') [lineno=33, column=18, prefix='']
          NAME('int') [lineno=33, column=20, prefix=' ']
        RPAR(')') [lineno=33, column=23, prefix='']
      RARROW('->') [lineno=33, column=25, prefix=' ']
      power [3 children]
        NAME('t') [lineno=33, column=28, prefix=' ']
        trailer [2 children]
          DOT('.') [lineno=33, column=29, prefix='']
          NAME('AsyncIterator') [lineno=33, column=30, prefix='']
        trailer [3 children]
          LSQB('[') [lineno=33, column=43, prefix='']
          NAME('int') [lineno=33, column=44, prefix='']
          RSQB(']') [lineno=33, column=47, prefix='']
      COLON(':') [lineno=33, column=48, prefix='']
      suite [4 children]
        NEWLINE('\n') [lineno=33, column=49, prefix='']
        INDENT('    ') [lineno=34, column=0, prefix='']
        for_stmt [6 children]
          NAME('for') [lineno=34, column=4, prefix='']
          NAME('i') [lineno=34, column=8, prefix=' ']
          NAME('in') [lineno=34, column=10, prefix=' ']
          power [2 children]
            NAME('range') [lineno=34, column=13, prefix=' ']
            trailer [3 children]
              LPAR('(') [lineno=34, column=18, prefix='']
              NAME('n') [lineno=34, column=19, prefix='']
              RPAR(')') [lineno=34, column=20, prefix='']
          COLON(':') [lineno=34, column=21, prefix='']
          suite [5 children]
            NEWLINE('\n') [lineno=34, column=22, prefix='']
            INDENT('        ') [lineno=35, column=0, prefix='']
            simple_stmt [2 children]
              yield_expr [2 children]
                NAME('yield') [lineno=35, column=8, prefix='']
                NAME('i') [lineno=35, column=14, prefix=' ']
              NEWLINE('\n') [lineno=35, column=15, prefix='']
            simple_stmt [2 children]
              power [4 children]
                AWAIT('await') [lineno=36, column=8, prefix='        ']
                NAME('asyncio') [lineno=36, column=14, prefix=' ']
                trailer [2 children]
                  DOT('.') [lineno=36, column=21, prefix='']
                  NAME('sleep') [lineno=36, column=22, prefix='']
                trailer [3 children]
                  LPAR('(') [lineno=36, column=27, prefix='']
                  NUMBER('0.1') [lineno=36, column=28, prefix='']
                  RPAR(')') [lineno=36, column=31, prefix='']
              NEWLINE('\n') [lineno=36, column=32, prefix='']
            DEDENT('') [lineno=39, column=0, prefix='\n\n']
        DEDENT('') [lineno=39, column=0, prefix='']
  async_stmt [2 children]
    ASYNC('async') [lineno=39, column=0, prefix='']
    funcdef[name='run'] [5 children]
      NAME('def') [lineno=39, column=6, prefix=' ']
      NAME('run') [lineno=39, column=10, prefix=' ']
      parameters [2 children]
        LPAR('(') [lineno=39, column=13, prefix='']
        RPAR(')') [lineno=39, column=14, prefix='']
      COLON(':') [lineno=39, column=15, prefix='']
      suite [4 children]
        NEWLINE('\n') [lineno=39, column=16, prefix='']
        INDENT('    ') [lineno=40, column=0, prefix='']
        async_stmt [2 children]
          ASYNC('async') [lineno=40, column=4, prefix='']
          with_stmt [4 children]
            NAME('with') [lineno=40, column=10, prefix=' ']
            with_item [3 children]
              power [2 children]
                NAME('AExecutor') [lineno=40, column=15, prefix=' ']
                trailer [2 children]
                  LPAR('(') [lineno=40, column=24, prefix='']
                  RPAR(')') [lineno=40, column=25, prefix='']
              NAME('as') [lineno=40, column=27, prefix=' ']
              NAME('submit') [lineno=40, column=30, prefix=' ']
            COLON(':') [lineno=40, column=36, prefix='']
            suite [6 children]
              NEWLINE('\n') [lineno=40, column=37, prefix='']
              INDENT('        ') [lineno=42, column=0, prefix='\n']
              async_stmt [2 children]
                ASYNC('async') [lineno=42, column=8, prefix='']
                funcdef[name='task'] [5 children]
                  NAME('def') [lineno=42, column=14, prefix=' ']
                  NAME('task') [lineno=42, column=18, prefix=' ']
                  parameters [3 children]
                    LPAR('(') [lineno=42, column=22, prefix='']
                    tname [3 children]
                      NAME('tag') [lineno=42, column=23, prefix='']
                      COLON(':') [lineno=42, column=26, prefix='']
                      NAME('str') [lineno=42, column=28, prefix=' ']
                    RPAR(')') [lineno=42, column=31, prefix='']
                  COLON(':') [lineno=42, column=32, prefix='']
                  suite [4 children]
                    NEWLINE('\n') [lineno=42, column=33, prefix='']
                    INDENT('            ') [lineno=43, column=0, prefix='']
                    async_stmt [2 children]
                      ASYNC('async') [lineno=43, column=12, prefix='']
                      for_stmt [6 children]
                        NAME('for') [lineno=43, column=18, prefix=' ']
                        NAME('i') [lineno=43, column=22, prefix=' ']
                        NAME('in') [lineno=43, column=24, prefix=' ']
                        power [2 children]
                          NAME('arange') [lineno=43, column=27, prefix=' ']
                          trailer [3 children]
                            LPAR('(') [lineno=43, column=33, prefix='']
                            NUMBER('3') [lineno=43, column=34, prefix='']
                            RPAR(')') [lineno=43, column=35, prefix='']
                        COLON(':') [lineno=43, column=36, prefix='']
                        suite [4 children]
                          NEWLINE('\n') [lineno=43, column=37, prefix='']
                          INDENT('                ') [lineno=44, column=0, prefix='']
                          simple_stmt [2 children]
                            power [2 children]
                              NAME('print') [lineno=44, column=16, prefix='']
                              trailer [3 children]
                                LPAR('(') [lineno=44, column=21, prefix='']
                                arglist [3 children]
                                  NAME('tag') [lineno=44, column=22, prefix='']
                                  COMMA(',') [lineno=44, column=25, prefix='']
                                  NAME('i') [lineno=44, column=27, prefix=' ']
                                RPAR(')') [lineno=44, column=28, prefix='']
                            NEWLINE('\n') [lineno=44, column=29, prefix='']
                          DEDENT('') [lineno=46, column=8, prefix='\n        ']
                    DEDENT('') [lineno=46, column=8, prefix='']
              simple_stmt [2 children]
                power [3 children]
                  AWAIT('await') [lineno=46, column=8, prefix='']
                  NAME('submit') [lineno=46, column=14, prefix=' ']
                  trailer [3 children]
                    LPAR('(') [lineno=46, column=20, prefix='']
                    power [2 children]
                      NAME('partial') [lineno=46, column=21, prefix='']
                      trailer [3 children]
                        LPAR('(') [lineno=46, column=28, prefix='']
                        arglist [3 children]
                          NAME('task') [lineno=46, column=29, prefix='']
                          COMMA(',') [lineno=46, column=33, prefix='']
                          STRING('"x"') [lineno=46, column=35, prefix=' ']
                        RPAR(')') [lineno=46, column=38, prefix='']
                    RPAR(')') [lineno=46, column=39, prefix='']
                NEWLINE('\n') [lineno=46, column=40, prefix='']
              simple_stmt [2 children]
                power [3 children]
                  AWAIT('await') [lineno=47, column=8, prefix='        ']
                  NAME('submit') [lineno=47, column=14, prefix=' ']
                  trailer [3 children]
                    LPAR('(') [lineno=47, column=20, prefix='']
                    power [2 children]
                      NAME('partial') [lineno=47, column=21, prefix='']
                      trailer [3 children]
                        LPAR('(') [lineno=47, column=28, prefix='']
                        arglist [3 children]
                          NAME('task') [lineno=47, column=29, prefix='']
                          COMMA(',') [lineno=47, column=33, prefix='']
                          STRING('"y"') [lineno=47, column=35, prefix=' ']
                        RPAR(')') [lineno=47, column=38, prefix='']
                    RPAR(')') [lineno=47, column=39, prefix='']
                NEWLINE('\n') [lineno=47, column=40, prefix='']
              DEDENT('') [lineno=50, column=0, prefix='\n\n']
        DEDENT('') [lineno=50, column=0, prefix='']
  funcdef[name='main'] [5 children]
    NAME('def') [lineno=50, column=0, prefix='']
    NAME('main') [lineno=50, column=4, prefix=' ']
    parameters [2 children]
      LPAR('(') [lineno=50, column=8, prefix='']
      RPAR(')') [lineno=50, column=9, prefix='']
    COLON(':') [lineno=50, column=10, prefix='']
    suite [4 children]
      NEWLINE('\n') [lineno=50, column=11, prefix='']
      INDENT('    ') [lineno=51, column=0, prefix='']
      simple_stmt [2 children]
        power [3 children]
          NAME('asyncio') [lineno=51, column=4, prefix='']
          trailer [2 children]
            DOT('.') [lineno=51, column=11, prefix='']
            NAME('run') [lineno=51, column=12, prefix='']
          trailer [3 children]
            LPAR('(') [lineno=51, column=15, prefix='']
            power [2 children]
              NAME('run') [lineno=51, column=16, prefix='']
              trailer [2 children]
                LPAR('(') [lineno=51, column=19, prefix='']
                RPAR(')') [lineno=51, column=20, prefix='']
            RPAR(')') [lineno=51, column=21, prefix='']
        NEWLINE('\n') [lineno=51, column=22, prefix='']
      DEDENT('') [lineno=54, column=0, prefix='\n\n']
  if_stmt [4 children]
    NAME('if') [lineno=54, column=0, prefix='']
    comparison [3 children]
      NAME('__name__') [lineno=54, column=3, prefix=' ']
      EQEQUAL('==') [lineno=54, column=12, prefix=' ']
      STRING('"__main__"') [lineno=54, column=15, prefix=' ']
    COLON(':') [lineno=54, column=25, prefix='']
    suite [4 children]
      NEWLINE('\n') [lineno=54, column=26, prefix='']
      INDENT('    ') [lineno=55, column=0, prefix='']
      simple_stmt [2 children]
        power [2 children]
          NAME('main') [lineno=55, column=4, prefix='']
          trailer [2 children]
            LPAR('(') [lineno=55, column=8, prefix='']
            RPAR(')') [lineno=55, column=9, prefix='']
        NEWLINE('\n') [lineno=55, column=10, prefix='']
      DEDENT('') [lineno=56, column=0, prefix='']
  ENDMARKER('') [lineno=56, column=0, prefix='']

That's all.

gist

The gist is below.

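Overriding part of a config file when loading it in Go

When loading a config file in Go, I sometimes want to override just part of the settings. This is a memo of the code for that.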

github.com

mergo?

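mergo seems to be a library that has been around for quite a while.

A helper to merge structs and maps in Golang. Useful for configuration default values, avoiding messy if-statements.

It merges two pieces of data in a sensible way.

For the details of its behavior, see the following.

Roughly speaking, mergo.Merge(<dst>, <src>) recursively copies values from <src> into <dst>, subject to these rules:

  • unexported fields are ignored
  • zero values are ignored

<src> itself is left untouched and <dst> is modified. By default only the fields of <dst> that still hold a zero value are filled in; replacing non-zero fields requires the mergo.WithOverride option.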
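
To make the "zero values are ignored" rule concrete, here is a minimal sketch of a fill-only merge written with the standard reflect package. It is a simplified stand-in, not mergo's implementation: the Deploy and Service types and the fillZero and Fill helpers are hypothetical names made up for this illustration, and maps, pointers, and interfaces are not handled.

```go
package main

import (
	"fmt"
	"reflect"
)

// Deploy and Service are trimmed-down, hypothetical stand-ins for config types.
type Deploy struct {
	Mode     string
	Replicas int
}

type Service struct {
	Image  string
	Ports  []string
	Deploy Deploy
}

// fillZero copies src into dst field by field, but only where the dst field
// still holds its zero value. Nested structs are handled recursively.
// Both arguments must be struct values of the same type, and dst must be settable.
func fillZero(dst, src reflect.Value) {
	for i := 0; i < dst.NumField(); i++ {
		d, s := dst.Field(i), src.Field(i)
		if !d.CanSet() {
			continue // unexported field: ignored
		}
		if d.Kind() == reflect.Struct {
			fillZero(d, s)
			continue
		}
		if d.IsZero() && !s.IsZero() {
			d.Set(s)
		}
	}
}

// Fill is a small typed wrapper around fillZero (hypothetical helper).
func Fill(dst, src *Service) {
	fillZero(reflect.ValueOf(dst).Elem(), reflect.ValueOf(src).Elem())
}

func main() {
	base := Service{Image: "mysql", Ports: []string{"8080:80"}, Deploy: Deploy{Mode: "replicated", Replicas: 2}}
	over := Service{Deploy: Deploy{Replicas: 5}}
	Fill(&over, &base) // over is the destination, base is the source
	fmt.Printf("%+v\n", over)
}
```

Because only zero-valued fields of the destination are filled, the override data plays the role of the destination and the base config plays the role of the source.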

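Trying it out

Let's actually try overriding a config (it took longer than expected to reach the code example, so if that is all you want, going straight to the gist may be faster).

Sorting out what I want to do

Let me force out an example of overriding a config file.

First, find some config file with a reasonably nested shape. A docker compose example is handy for this.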

docker-compose.yml

version: "3.7"

services:
  wordpress:
    image: wordpress
    ports:
      - "8080:80"
    networks:
      - overlay
    deploy:
      mode: replicated
      replicas: 2
      endpoint_mode: vip

  mysql:
    image: mysql
    volumes:
       - db-data:/var/lib/mysql/data
    networks:
       - overlay
    deploy:
      mode: replicated
      replicas: 2
      endpoint_mode: dnsrr

volumes:
  db-data:

networks:
  overlay:

I want to use it with the following changes.

--- 03config/output.txt  2020-02-19 20:07:00.000000000 +0900
+++ 04overwrite-config/output.txt 2020-02-19 20:07:05.000000000 +0900
@@ -3,7 +3,7 @@
   wordpress:
     image: wordpress
     ports:
-    - '8080:80'
+    - '9090:80'
     networks:
     - overlay
     deploy:
@@ -18,7 +18,7 @@
     - overlay
     deploy:
       mode: replicated
-      replicas: 2
+      replicas: 5
       endpoint_mode: dnsrr
 volumes:
   db-data: null

Convert it to JSON so that Go can read it (using go-yaml or similar would work just as well).

$ dictknife cat -o json docker-compose.yml > config.json

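Plain loading

I don't like showing only excerpts of code, so I'll write out the plain loading part properly as well.

Paste the converted JSON into a service such as json-to-go (the URL is in the code comment) to get the Go struct code.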

conf/config.go

package conf

// https://docs.docker.com/compose/compose-file/
// https://mholt.github.io/json-to-go/

type Config struct {
    Version  string   `json:"version"`
    Services Services `json:"services"`
    Volumes  Volumes  `json:"volumes"`
    Networks Networks `json:"networks"`
}
type Deploy struct {
    Mode         string `json:"mode"`
    Replicas     int    `json:"replicas"`
    EndpointMode string `json:"endpoint_mode"`
}
type Wordpress struct {
    Image    string   `json:"image"`
    Ports    []string `json:"ports"`
    Networks []string `json:"networks"`
    Deploy   Deploy   `json:"deploy"`
}
type Mysql struct {
    Image    string   `json:"image"`
    Volumes  []string `json:"volumes"`
    Networks []string `json:"networks"`
    Deploy   Deploy   `json:"deploy"`
}
type Services struct {
    Wordpress Wordpress `json:"wordpress"`
    Mysql     Mysql     `json:"mysql"`
}
type Volumes struct {
    DbData interface{} `json:"db-data"`
}
type Networks struct {
    Overlay interface{} `json:"overlay"`
}

The loading function looks like this.

// JSONLoadFile decodes the JSON file at filename into c (c must be a pointer).
func JSONLoadFile(filename string, c interface{}) error {
    f, err := os.Open(filename)
    if err != nil {
        return err
    }
    defer f.Close()
    decoder := json.NewDecoder(f)
    return decoder.Decode(c)
}

It is used like this.

    var c conf.Config
    if err := JSONLoadFile(config, &c); err != nil {
        return err
    }

    pp.ColoringEnabled = false
    pp.Println(c)

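Checking the override

I'll check the override as follows.

  1. 00config/main.go is the code that does not override anything
  2. 01overwrite-config/main.go is the code that overrides (using overwrite.json)
  3. take the diff of the outputs of 00 and 01 and look at the difference
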
00config
├── main.go
└── output.txt
01overwrite-config
├── main.go
└── output.txt
config.json
overwrite.json
Makefile
conf
└── config.go

This is how I intend to run it:

$ go run 00config/main.go --config config.json | tee 00config/output.txt
$ OVERWRITE_CONFIG=overwrite.json go run 01overwrite-config/main.go --config config.json | tee 01overwrite-config/output.txt
$ diff -u 00config/output.txt 01overwrite-config/output.txt > 02.diff

For reference, go.mod looks like this.

go.mod

module m

go 1.13

require (
    github.com/imdario/mergo v0.3.8 // indirect
    github.com/k0kubun/pp v3.0.1+incompatible // indirect
    github.com/mattn/go-colorable v0.1.4 // indirect
    github.com/spf13/pflag v1.0.5 // indirect
)

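The actual override code

Finally, the main topic.

If the environment variable OVERWRITE_CONFIG is set, its value is used as the file to override the config with.

Pass a file like the following.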

overwrite.json

{
  "services": {
    "wordpress": {
      "ports": [
        "9090:80"
      ]
    },
    "mysql": {
      "deploy": {
        "replicas": 5
      }
    }
  }
}
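
A partial file like this can be merged because encoding/json leaves every field that is absent from the input at its zero value. A minimal sketch, assuming trimmed-down copies of the Deploy and Mysql structs and a hypothetical decodeMysql helper:

```go
package main

import (
	"encoding/json"
	"fmt"
	"strings"
)

// Trimmed-down, hypothetical copies of the config structs.
type Deploy struct {
	Mode     string `json:"mode"`
	Replicas int    `json:"replicas"`
}

type Mysql struct {
	Image  string `json:"image"`
	Deploy Deploy `json:"deploy"`
}

// decodeMysql decodes a JSON document into a fresh Mysql value.
func decodeMysql(doc string) (Mysql, error) {
	var m Mysql
	err := json.NewDecoder(strings.NewReader(doc)).Decode(&m)
	return m, err
}

func main() {
	// Only deploy.replicas is present; Image and Deploy.Mode stay empty.
	m, err := decodeMysql(`{"deploy": {"replicas": 5}}`)
	if err != nil {
		panic(err)
	}
	fmt.Printf("%+v\n", m)
}
```

Those empty fields are exactly the ones a zero-value-filling merge can later take from the base config.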

The config loading code changed to the following.

// JSONLoadFile decodes the JSON file at filename into c (repeated from above).
func JSONLoadFile(filename string, c interface{}) error {
    f, err := os.Open(filename)
    if err != nil {
        return err
    }
    defer f.Close()
    decoder := json.NewDecoder(f)
    return decoder.Decode(c)
}

// LoadConfig loads filename and, if OVERWRITE_CONFIG is set, overrides it with that file.
func LoadConfig(filename string) (*conf.Config, error) {
    var c conf.Config
    if err := JSONLoadFile(filename, &c); err != nil {
        return nil, err
    }

    overwritefile := os.Getenv("OVERWRITE_CONFIG")
    if overwritefile == "" {
        return &c, nil
    }

    fmt.Fprintf(os.Stderr, "***** OVERWRITE CONFIG by %q *****\n", overwritefile)

    var c2 conf.Config
    if err := JSONLoadFile(overwritefile, &c2); err != nil {
        return &c, err
    }
    if err := mergo.Merge(&c2, &c); err != nil {
        return &c, err
    }
    return &c2, nil
}
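
As a point of comparison, a similar overlay can be sketched with the standard library alone, by decoding the base file and then the override file into the same struct. This is a different technique from the mergo-based code above, not what the gist does: json.Unmarshal only touches fields that are present in the input and replaces slices wholesale. The trimmed-down types and the overlay function are hypothetical.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Trimmed-down, hypothetical copies of the config structs.
type Deploy struct {
	Mode     string `json:"mode"`
	Replicas int    `json:"replicas"`
}

type Wordpress struct {
	Image  string   `json:"image"`
	Ports  []string `json:"ports"`
	Deploy Deploy   `json:"deploy"`
}

// overlay decodes base first and then decodes override into the same value,
// so fields present in override win and all other fields keep the base value.
func overlay(base, override []byte) (Wordpress, error) {
	var w Wordpress
	if err := json.Unmarshal(base, &w); err != nil {
		return w, err
	}
	if err := json.Unmarshal(override, &w); err != nil {
		return w, err
	}
	return w, nil
}

func main() {
	base := []byte(`{"image": "wordpress", "ports": ["8080:80"], "deploy": {"mode": "replicated", "replicas": 2}}`)
	override := []byte(`{"ports": ["9090:80"], "deploy": {"replicas": 0}}`)
	w, err := overlay(base, override)
	if err != nil {
		panic(err)
	}
	fmt.Printf("%+v\n", w)
}
```

Unlike a merge that ignores zero values, this lets an override file set a field explicitly to zero (for example "replicas": 0).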

Run it like this.

$ go run 00config/main.go --config config.json | tee 00config/output.txt
$ OVERWRITE_CONFIG=overwrite.json go run 01overwrite-config/main.go --config config.json | tee 01overwrite-config/output.txt
$ diff -u 00config/output.txt 01overwrite-config/output.txt > 02.diff || exit 0

The diff. Looks good.

02.diff

--- 00config/output.txt  2020-02-19 20:01:01.000000000 +0900
+++ 01overwrite-config/output.txt 2020-02-19 20:01:01.000000000 +0900
@@ -1,10 +1,10 @@
-conf.Config{
+&conf.Config{
   Version:  "3.7",
   Services: conf.Services{
     Wordpress: conf.Wordpress{
       Image: "wordpress",
       Ports: []string{
-        "8080:80",
+        "9090:80",
       },
       Networks: []string{
         "overlay",
@@ -25,7 +25,7 @@
       },
       Deploy: conf.Deploy{
         Mode:         "replicated",
-        Replicas:     2,
+        Replicas:     5,
         EndpointMode: "dnsrr",
       },
     },

gist

The full code is below. Note that the directory structure was flattened when uploading to gist, so it cannot be run as-is.

(Lately I've started using pflag (what about cobra and viper, I wonder?))

github.com

By the way

Doing something similar in Python looks like the following (no schema is defined, so the situation is not the same; there are no zero values, so the same kind of merge is hard to do).

import os
from handofcats import as_command
from dictknife import loading
from dictknife import deepmerge

# pip install dictknife[yaml] handofcats
# $ OVERWRITE_CONFIG=<overwrite file> python <file>.py --config <config file>

@as_command
def run(*, config: str) -> None:
    c = loading.loadfile(config)

    overwrite_file = os.environ.get("OVERWRITE_CONFIG")
    if overwrite_file is not None:
        c2 = loading.loadfile(overwrite_file)
        c = deepmerge(c, c2, method="replace")
    loading.dumpfile(c)

And in fact, it is surprisingly hard to settle on a good default for merging nested dicts.
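
To illustrate why the default is hard to pick, here is a minimal Go sketch of one possible policy for nested maps: maps are merged key by key, and everything else, lists included, is replaced. mergeReplace is a hypothetical helper, not part of dictknife or mergo; appending lists instead of replacing them would be an equally defensible choice.

```go
package main

import "fmt"

// mergeReplace merges src into dst in place. Nested maps are merged key by
// key; any other value in src (including a list) replaces the one in dst.
func mergeReplace(dst, src map[string]interface{}) {
	for k, sv := range src {
		sm, sok := sv.(map[string]interface{})
		dm, dok := dst[k].(map[string]interface{})
		if sok && dok {
			mergeReplace(dm, sm)
			continue
		}
		dst[k] = sv
	}
}

func main() {
	dst := map[string]interface{}{
		"deploy": map[string]interface{}{"mode": "replicated", "replicas": 2},
		"ports":  []interface{}{"8080:80"},
	}
	src := map[string]interface{}{
		"deploy": map[string]interface{}{"replicas": 5},
		"ports":  []interface{}{"9090:80"},
	}
	mergeReplace(dst, src)
	fmt.Println(dst)
}
```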

By the way 2

With something like hydra in Python, you can pass the values to override as command-line arguments, as shown below.

github.com

Like this.

$ python 06hydra/main.py
db:
  driver: mysql
  pass: secret
  user: omry

$ python 06hydra/main.py db.pass=oyoyo
db:
  driver: mysql
  pass: oyoyo
  user: omry

The code for this is below.

Reading the explanation around here gives you the general idea.

(Addendum: I recall viper offering a similar feature.)
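
The idea behind such key=value arguments can be sketched in Go as follows. This is only an illustration under simple assumptions, not how hydra or viper implement it: applyOverride is a hypothetical helper that splits a dotted key and writes the value, kept as a string, into a nested map.

```go
package main

import (
	"fmt"
	"strings"
)

// applyOverride applies one "a.b.c=value" argument to a nested map,
// creating intermediate maps as needed. The value is stored as a string.
func applyOverride(cfg map[string]interface{}, arg string) error {
	kv := strings.SplitN(arg, "=", 2)
	if len(kv) != 2 || kv[0] == "" {
		return fmt.Errorf("invalid override %q, want key=value", arg)
	}
	keys := strings.Split(kv[0], ".")
	cur := cfg
	for _, k := range keys[:len(keys)-1] {
		next, ok := cur[k].(map[string]interface{})
		if !ok {
			// Missing or non-map value: start a fresh nested map.
			next = map[string]interface{}{}
			cur[k] = next
		}
		cur = next
	}
	cur[keys[len(keys)-1]] = kv[1]
	return nil
}

func main() {
	cfg := map[string]interface{}{
		"db": map[string]interface{}{"driver": "mysql", "pass": "secret", "user": "omry"},
	}
	if err := applyOverride(cfg, "db.pass=oyoyo"); err != nil {
		panic(err)
	}
	fmt.Println(cfg)
}
```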

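Summary of the asides

  • (with mergo, overriding settings is reasonably easy)
  • whether or not to define a schema needs some thought
  • a CLI that supports overrides via command-line arguments is also conceivable
  • (migrating to configuration via environment variables using viper or similar (in the 12 Factor App sense))
  • (handling cases such as AWS Parameter Store)

(Addendum: the topic of validation may be missing entirely.)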