Tuesday, May 3, 2016

MongoDB dump/restore without temporary files

In MongoDB 3.2, a new --archive option is available in mongodump and mongorestore, which allows copying databases without temporary directories:

mongodump --host=localhost:37017 --db=userstate_10b --archive | mongorestore --host=localhost:27017 --drop --archive
MongoDB 3.2 also has a new --gzip option, but it doesn't make sense to use it here, because mongodump and mongorestore are running on the same machine.
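
The same pipe can be driven from a script. Here is a minimal Python sketch, assuming `mongodump` and `mongorestore` (3.2 or newer) are on the PATH. The function names `build_commands` and `copy_database` are made up for illustration; the hosts and the database name are the ones from the command above.

```python
import subprocess


def build_commands(src_host, dst_host, db):
    """Return the (mongodump, mongorestore) argument lists for a piped copy."""
    dump = ["mongodump", "--host=" + src_host, "--db=" + db, "--archive"]
    restore = ["mongorestore", "--host=" + dst_host, "--drop", "--archive"]
    return dump, restore


def copy_database(src_host, dst_host, db):
    """Stream the archive from mongodump straight into mongorestore."""
    dump_cmd, restore_cmd = build_commands(src_host, dst_host, db)
    dump = subprocess.Popen(dump_cmd, stdout=subprocess.PIPE)
    restore = subprocess.Popen(restore_cmd, stdin=dump.stdout)
    # Close our copy of the pipe so mongodump notices if mongorestore exits early.
    dump.stdout.close()
    restore_status = restore.wait()
    dump_status = dump.wait()
    if dump_status or restore_status:
        raise RuntimeError("copy failed: mongodump=%s, mongorestore=%s"
                           % (dump_status, restore_status))


# Usage (needs both servers running):
# copy_database("localhost:37017", "localhost:27017", "userstate_10b")
```

Checking both exit codes matters here, because a plain shell pipe only reports the status of the last command.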

Saturday, April 9, 2016

Fixing HTTP latencies for `.local` hosts in /etc/hosts in Mac OSX 10.11.4 (El Capitan)

I have a problem on my Mac.

I have a web server on localhost, configured as the `app.local` virtual server in Nginx. When I make requests to it, I notice that each request takes about 5 seconds. It looks very much like a DNS resolution timeout.

The problem is known and there are several solutions on the Internet, but only this one helped me:

`/etc/hosts`:

127.0.0.1 app.local
::1 app.local

The trick is to specify the IPv6 address too, and to put one host per line: hosts after the first one on a line are ignored (a bug?).
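
To check an existing hosts file against these two rules, something like the following Python sketch could help. The function name `check_local_hosts` and the warning texts are my own invention; it only inspects the text and does not prove that the resolver really behaves this way.

```python
def check_local_hosts(hosts_text, suffix=".local"):
    """Return a list of warnings about `suffix` host names in /etc/hosts-style text."""
    ipv4, ipv6, warnings = set(), set(), []
    for line in hosts_text.splitlines():
        fields = line.split("#", 1)[0].split()  # drop comments, split on whitespace
        if len(fields) < 2:
            continue
        address, names = fields[0], fields[1:]
        local_names = [n for n in names if n.endswith(suffix)]
        if local_names and len(names) > 1:
            warnings.append("several hosts on one line: " + " ".join(names))
        for name in local_names:
            # an address containing ":" is treated as IPv6
            (ipv6 if ":" in address else ipv4).add(name)
    for name in sorted(ipv4 - ipv6):
        warnings.append("no IPv6 entry for " + name)
    return warnings
```

It can be run against the real file with `check_local_hosts(open("/etc/hosts").read())`; an empty list means both rules are satisfied.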

Tuesday, March 22, 2016

Environment variable injector for Jenkins

This is based on my previous post about injecting environment variables in Jenkins so that they can be used between steps and in post-build steps.

I created a Groovy script which injects the passed environment variables and even passes their values through the shell, so variable references and command substitutions in them are expanded.

inject_env_vars.groovy
// Inject environment variables using Groovy

import hudson.model.*
import hudson.AbortException
import groovy.json.StringEscapeUtils


def build = Thread.currentThread().executable


def injectEnvVars(envVars) {
    for (item in envVars) {
        build.addAction(
            new ParametersAction([
                new StringParameterValue(item.key, item.value)
            ])
        )
    }
}

def bash(cmd, env) {

    cmd = cmd as String

    // create a process for the shell
    pb = new ProcessBuilder(["bash", "-c", cmd])
    // make job workspace directory as the current one
    pb.directory(new File(env['WORKSPACE']))
    pb.environment().putAll(env)
    // capture messages sent to stderr
    pb.redirectErrorStream(true)
    shell = pb.start()
    shell.getOutputStream().close()
    // capture the output from the command
    def shellIn = shell.getInputStream()

    def reader = new BufferedReader(new InputStreamReader(shellIn))
    def builder = new StringBuilder()
    while ( (line = reader.readLine()) != null ) {
       builder.append(line)
       builder.append(System.getProperty("line.separator"))
    }
    result = builder.toString()

    // wait for the shell to finish and get the return code
    def exitStatus = shell.waitFor()

    try {
        shellIn.close();
    } catch (IOException ignoreMe) {}

    if (exitStatus) {
        throw new AbortException(result)
    }
    return result
}

def readPropsFile(filePath) {
    // https://en.wikipedia.org/wiki/.properties
    def props = new Properties()
    new File(filePath).withInputStream {
        stream -> props.load(stream)
    }
    return props
}

def readEnvSh(filePath) {
    vars = [:]
    file = new File(filePath)
    file.eachLine { line ->
      // split on the first '=' only, so that values may contain '=' themselves
      def matcher = (line =~ '^([^=]*)=(.*)$')
      if (matcher) {
        def name = matcher.group(1)
        def value = matcher.group(2)
        if (value.startsWith('"')) { value = value[1..-2] }
        vars[name] = value
      }
    }
    return vars
}

// add some useful environment variables
env = build.getEnvironment(listener)
if (!env.containsKey('BUILD_USER')) {
    def userCause = build.getCause(hudson.model.Cause$UserIdCause)
    def userName = userCause?.userId ?: 'Jenkins'
    injectEnvVars([
        'BUILD_USER': userName,
        'JOB_DIR': "${env.WORKSPACE}/../../jobs/${env.JOB_NAME}",
        'BUILD_DIR': "${env.WORKSPACE}/../../jobs/${env.JOB_NAME}/builds/${env.BUILD_ID}",
    ])
}

// inject build result string: http://javadoc.jenkins-ci.org/hudson/model/Result.html
injectEnvVars(['BUILD_RESULT': "${build.result}"])

for (item in binding.variables.clone()) {
    def varName = item.key
    def varValue = item.value
    if (!(varValue instanceof String)) {
        // skip values injected by Jenkins Groovy plugin
        continue
    }
    if (varName == '_') {
        // run the given script code
        new GroovyShell(
            new Binding([
                'env': build.getEnvironment(listener),
                 'injectEnvVars': this.&injectEnvVars,
                 'readEnvSh': this.&readEnvSh,
            ])
        ).evaluate(varValue)
    } else {
        varValue = StringEscapeUtils.escapeJava(varValue)
        varValue = bash("echo \"${varValue}\"", build.getEnvironment(listener))
        injectEnvVars(["${varName}": varValue])
    }
}

Usage examples:

hipchat_message='${BUILD_USER} <a href="$BUILD_URL">started deploying backend</a> to <b>${project} ${target}</b>'
evaluate(new File("/home/jenkins/workspace/devops/inject_env_vars.groovy"))

hipchat_message='$(/home/jenkins/workspace/devops/deploy/get_deploy_info.py)'
evaluate(new File("/home/jenkins/workspace/devops/inject_env_vars.groovy"))


// variable `_` is considered to contain script code inside `inject_env_vars.groovy`
_='''
_env = readEnvSh("${env.WORKSPACE}/env.sh")

if (env['BUILD_RESULT'] == 'SUCCESS') {
    // single quoted values will be expanded when passed to injectEnvVars
    _env['hipchat_message'] = 'Server build succeeded: <a href="https://$SERVER_NAME/">$SERVER_NAME</a> (<a href="$BUILD_URL">Job</a>)'
} else {
    ...
}

injectEnvVars(_env)
'''
evaluate(new File("/home/jenkins/workspace/devops/inject_env_vars.groovy"))

These should be run as a System Groovy script.
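
For readers who don't speak Groovy, here is the parsing idea behind `readEnvSh` as a small Python sketch. It is not a port of the script: the name `read_env_sh` is made up, it takes text instead of a file path, it splits on the first `=`, and it strips quotes only when the value is quoted on both ends.

```python
import re

# Split on the first "=" only, so values may contain "=" themselves.
_ASSIGNMENT = re.compile(r'^([^=]+)=(.*)$')


def read_env_sh(text):
    """Parse NAME=value lines into a dict, stripping one pair of double quotes."""
    variables = {}
    for line in text.splitlines():
        match = _ASSIGNMENT.match(line.strip())
        if not match:
            continue  # blank lines and anything without "=" are skipped
        name, value = match.group(1), match.group(2)
        if len(value) >= 2 and value.startswith('"') and value.endswith('"'):
            value = value[1:-1]
        variables[name] = value
    return variables
```

Like the Groovy helper, this does no shell expansion of the values; it only splits names from values.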

I hope this will someday be implemented as a plugin.

Wednesday, January 20, 2016

The Clean Architecture in Python

Even design-conscious programmers find large applications difficult to maintain. Come learn about how the recently propounded “Clean Architecture” applies in Python, and how this high-level design pattern fits particularly well with the features of the Python language and answers questions that experienced programmers have been asking.
The Clean Architecture in Python, Brandon Rhodes, PyOhio 2014

Tuesday, December 1, 2015

PyMongo query logging

I needed to see what queries are sent to the server in my MongoEngine application.

I made one logger here: https://github.com/warvariuc/python-mongo-logger based on https://gist.github.com/kesor/1589672

Recently I improved the solution but did not commit it to the repo, because I put the code directly into the project instead of listing git+git://github.com/warvariuc/python-mongo-logger in the requirements.pip file:

from __future__ import absolute_import, print_function, unicode_literals, division

import logging
import time
import traceback
import inspect

from pymongo.mongo_client import MongoClient
from bson import json_util


logger = logging.getLogger('mongologger')


def activate(until_modules=('pymongo', 'mongoengine'), stack_size=3):
    """Activate Mongo-Logger.
    Args:
        until_modules (tuple): top-level module names until which the stack should be shown;
          pass an empty sequence to show the whole stack
        stack_size (int): how many frames before any of `until_modules` was entered to show; pass
          -1 to show the whole stack or 0 to show no stack
    """
    # monkey-patch methods to record messages
    MongoClient._send_message_with_response = _instrument(
        MongoClient._send_message_with_response, until_modules, stack_size)
    return logger


def _instrument(original_method, until_modules, stack_size):
    """Monkey-patch the given pymongo function which sends queries to MongoDB.
    """
    def _send_message_with_response(client, operation, read_preference=None,
                                    exhaust=False, address=None):
        start_time = time.time()
        # forward the caller's arguments instead of resetting them to the defaults
        result = original_method(client, operation, read_preference=read_preference,
                                 exhaust=exhaust, address=address)
        duration = time.time() - start_time
        try:
            stack = ('\n' + ''.join(get_stack(until_modules, stack_size))).rstrip()
            logger.info('%.3f %s %s %s%s', duration, operation.name, operation.ns,
                        json_util.dumps(operation.spec, ensure_ascii=False), stack)
        except Exception as exc:
            logger.info('%.3f *** Failed to log the query *** %s', duration, exc)
        return result

    return _send_message_with_response


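def get_stack(until_modules, stack_size):
    """Return formatted stack lines for the frames that led into `until_modules`.

    Frames inside `until_modules` (and deeper) are dropped; at most `stack_size`
    of the remaining caller frames are kept, or all of them if `stack_size` is negative.
    """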
    if not stack_size:
        return []
    frames = inspect.stack()[2:]
    frame_index = None
    for i, (frame, _, _, _, _, _) in enumerate(frames):
        module_name, _, _ = frame.f_globals['__name__'].partition('.')
        if module_name in until_modules:
            frame_index = i
        elif frame_index is not None:
            # found first frame before the needed module frame was entered
            break

    if frame_index is not None:
        del frames[:frame_index + 1]
        if stack_size >= 0:
            del frames[stack_size:]

    # the code context is None when the source is unavailable
    stack = [(filename, lineno, name, lines[0] if lines else None)
             for frame, filename, lineno, name, lines, _ in frames]

    return traceback.format_list(stack)

You activate the logger in the logging configuration like this:

import mongologger
LOGGING['loggers'][mongologger.activate(stack_size=0).name] = {
    'level': 'INFO',
    'handlers': ['stdout'],
    'propagate': False,
}
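
The snippet above assumes a `LOGGING` dict that already defines a `stdout` handler. For a standalone script, a minimal sketch with the standard `logging.config.dictConfig` could look like this. The formatter name `plain` and its format string are my own choices, and the literal logger name stands in for `mongologger.activate(...).name` so that the sketch runs without pymongo.

```python
import logging
import logging.config

LOGGING = {
    'version': 1,
    'disable_existing_loggers': False,
    'formatters': {
        'plain': {'format': '%(asctime)s %(name)s %(message)s'},
    },
    'handlers': {
        'stdout': {
            'class': 'logging.StreamHandler',
            'stream': 'ext://sys.stdout',
            'formatter': 'plain',
        },
    },
    'loggers': {},
}

# In the real project the key would be mongologger.activate(...).name;
# the literal name is used here so the sketch runs without pymongo.
LOGGING['loggers']['mongologger'] = {
    'level': 'INFO',
    'handlers': ['stdout'],
    'propagate': False,
}

logging.config.dictConfig(LOGGING)
logging.getLogger('mongologger').info('logger is configured')
```

With `propagate` set to `False`, the query lines go only to the `stdout` handler and are not repeated by the root logger.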


Saturday, November 28, 2015

Export environment variables for post-build steps in Jenkins

I needed a way to pass environment variables to post-build steps in Jenkins. The HipChat notification plugin allows using environment variables in its templates, but I cannot run commands there, so the template flexibility is quite limited.

So I wanted to build the message text in a build step and then pass it to the post-build HipChat notification step.

There is the EnvInject Plugin to accomplish this, but I found it not user-friendly: I had to use two steps: 1) echo values to a properties file; 2) then use EnvInject to read the values from the file and export them.

I found a solution that suits me better. In build steps I only create/update the properties file:

echo "hipchat_message=Server build succeeded: <a href='https://$SERVER_NAME/'>$SERVER_NAME</a> (<a href='$BUILD_URL'>Job</a>)" > "$WORKSPACE/postbuild.props"

and then in the first post-build step use this Groovy script:

/*
Inject environment variables using Groovy because EnvInject plugin is not user-friendly
*/

import hudson.model.*

def console = manager.listener.logger.&println

// read the props file
def props = new Properties()
new File("${manager.envVars['WORKSPACE']}/postbuild.props").withInputStream { 
    stream -> props.load(stream) 
}

props.each { key, value ->
    console("${key}:${value}")
    def pa = new ParametersAction([
        new StringParameterValue(key, value)
    ])
    manager.build.addAction(pa)
}

Friday, November 20, 2015

Cache and SSL in Chrome on Ubuntu Linux

I had a hard time testing whether the cache headers I had added to static files were working. Firefox worked as expected, while Chrome didn't.

Here is the Nginx configuration I was testing:

    # static files
    location ~ ^/(***)?$ {
        root ***;
        try_files $uri /;
        add_header Cache-Control "max-age=3600";
        gzip on;
        #etag on;
        access_log off;
    }

I tried different combinations to find out why the files were sometimes taken from the cache (judging by the developer console in Chrome) and why at other times Chrome always made requests that got a 200 response.

My tests showed that my Chrome used the cache if both the Cache-Control and ETag headers were present. The problem is that if the `etag on` directive is enabled together with `gzip on`, ETags are not generated: https://trac.nginx.org/nginx/ticket/377

When I disable gzip and enable etag, the cache works OK.
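
While trying such combinations, it helps to look at the response headers outside the browser. Here is a small Python sketch that summarizes the headers discussed above; the function name `cache_validators` and the shape of the returned dict are made up for illustration.

```python
def cache_validators(headers):
    """Summarize the caching-related response headers (names are case-insensitive)."""
    lowered = {name.lower(): value for name, value in headers.items()}
    cache_control = lowered.get('cache-control', '')
    max_age = None
    for directive in cache_control.split(','):
        key, _, value = directive.strip().partition('=')
        if key.lower() == 'max-age' and value.isdigit():
            max_age = int(value)
    return {
        'max_age': max_age,
        'etag': lowered.get('etag'),
        'gzip': lowered.get('content-encoding', '').lower() == 'gzip',
    }
```

The headers can be taken from `urllib.request.urlopen(url).headers`, which supports `.items()`. Note that the request has to send `Accept-Encoding: gzip` to see the gzipped variant, and a self-signed certificate needs a suitable SSL context.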

But my colleagues said that in their browsers, including Chrome on OS X, the cache worked as expected.

It turned out that the server I was testing on had a self-signed certificate, and Chrome has issues with caching in such cases: https://code.google.com/p/chromium/issues/detail?id=110649

I tried to export the certificate and import it back as a trusted certificate, but it didn't work. They say:

Google Chrome in Linux doesn’t have a SSL certificate manager, it relies on the NSS Shared DB.


Now the cache works as expected in my Chrome on Ubuntu Linux.