@dirceu

GSoC Summary

As the Google Summer of Code coding period comes to an end, I must write about what I have done, so here we go. My proposed tasks were:

The first two weeks were spent on this task and on getting familiar with the gocept.zeoraid codebase. As I said in my first report, I didn’t find anything wrong with the invalidation process in my tests, so I moved on to another task.

Task 5

This was a reasonably easy one. My main problem was getting an automated test to prove the implementation. It’s hard to test a distributed system, and doing the setup is a little problematic; anyway, with help from my mentor, Christian Theune, I got a working test. My work is on this branch.

Task 4

This one was tricky. I had no “real world” experience writing threaded code, so I ran into many obvious bugs - joining threads, dealing with exception handling and such things. I spent almost four weeks working on this task, but in the end I got it right. My work is on this branch.

Task 3

I started researching a bit about this task, but Christian said that task 2 was much more important, so I stopped working on it.

Task 2

My final task of the GSoC period. I started by writing an “add” command for the ZEORaid command-line tool, as I described in this report. It was easy to get this working for ClientStorage objects, but I needed to get it working with other types of storage too; some time later Christian said that it was only a “helper” for me to better understand how ZEORaid handles storages and openers.

I dropped this implementation and started working on reloading the entire config file. I got it working in a couple of days, but spent much more time than that working on an automated test for it. My work is on this branch.


GSoC Report - Week 12

Quick report: this week I finished the ‘reloading’ task - now with automated tests, thanks to the help of Christian, my mentor. Now I need to document my branches and write a post about what I did during the coding period. I need to do this over the weekend, because next Monday is the ‘pencils down’ date, when all mentors must submit a final evaluation of the work done by their students.


GSoC report - week 11

I’ve nearly finished the ‘reloading’ task described in my last report; now we’re able to reload the config file without modifications. The code is quite simple: we open and parse the config file, then use the list of storages defined there to see which storages need to be removed (and then we disable them) and which storages need to be added (and then we add and ‘recover’ them). I’m now working on an automated test to prove the implementation is right.
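The comparison step described above can be sketched as a pair of set differences. This is only an illustration, not the code from the branch: `plan_reload` and the storage names are hypothetical, and the real implementation works on parsed config sections and storage openers rather than plain strings.

```python
def plan_reload(current, configured):
    """Compare running storages with those named in the reloaded config.

    Hypothetical helper: both arguments are iterables of storage names.
    Returns (to_remove, to_add) as sorted lists.
    """
    current = set(current)
    configured = set(configured)
    # Storages that are running but no longer configured get disabled.
    to_remove = sorted(current - configured)
    # Storages that are configured but not yet running get added and recovered.
    to_add = sorted(configured - current)
    return to_remove, to_add


to_remove, to_add = plan_reload(["1", "2", "3"], ["2", "3", "4"])
print("disable:", to_remove)
print("add and recover:", to_add)
```

Storages present in both lists are left untouched, so a reload with an unchanged config file is a no-op.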


GSoC report - week 10

Just a quick report: this week I worked on reloading the zeo.conf file, so we can add or remove storages from the ZEORaid server without having to restart it. It seems to be working well, but it needs better (automated) testing and I need to use the correct ZODB.config options to recognize the config file without problems (this has been my main issue for the week). My work is on this branch.


GSoC report - week 9

This week I’ve been working on allowing new backend storages to be defined and removed at runtime, either with the management tool or by reloading the config. I worked first on defining a new storage using the management tool. Using this branch, you can connect to a new ClientStorage using ‘zeoraidX-manage-main add IP:PORT:STORAGE_NAME’. Example:

Now I need to work on the following things:


GSoC report - week 8

I’ve been working on the task I described in my last post, about parallelizing requests to multiple backends. After a long time getting strange errors in the tests, I managed to isolate the problem - my code wasn’t handling the exceptions correctly.

ZEORaid considers ZODB.POSException.POSError and transaction.interfaces.TransactionError valid answers from storages, as they don’t indicate storage failure. The problem was that I was getting some ConflictErrors (a kind of TransactionError) that were being considered a sign of storage failure when running ZODB tests.
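A minimal sketch of that distinction follows. The exception classes here are local stand-ins so the example runs without ZODB installed, and `call_backend` is a hypothetical helper, not ZEORaid's actual code; the class hierarchy is simplified to match the description above.

```python
class POSError(Exception):
    """Stand-in for ZODB.POSException.POSError."""


class TransactionError(Exception):
    """Stand-in for transaction.interfaces.TransactionError."""


class ConflictError(TransactionError):
    """Stand-in for a conflict error, treated here as a TransactionError."""


# Exceptions that count as a valid answer from a healthy storage.
VALID_ANSWERS = (POSError, TransactionError)


def call_backend(method, *args):
    """Call a backend method and return (reliable, result_or_exception).

    reliable is False only when the exception suggests the storage failed.
    """
    try:
        return True, method(*args)
    except VALID_ANSWERS as exc:
        # The storage answered correctly; the answer just happens to be an error.
        return True, exc
    except Exception as exc:
        # Anything else is treated as a sign of storage failure.
        return False, exc
```

With this split, a ConflictError raised by one backend is passed on as an answer instead of causing that backend to be marked as failed.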

Christian helped me work on handling this, and now all tests are passing. I’m just refactoring the code now and will proceed to the next task: allowing new backend storages to be defined and removed at runtime, using the client tool or by reloading the config.


GSoC report - week 5

At the beginning of this week I finished the tests for my last task and did some refactoring of its code. My main problem was getting a storage implementation to use in my tests; first I wrote a subclass of ZODB.DemoStorage.DemoStorage, but it didn’t implement the full API needed by ZEORaid. Christian said that I should change the base class to ZODB.FileStorage.FileStorage - a pretty trivial change, and it solved my problems. After that I fixed some things in the test (which could raise false alarms).

With that finished, I began my current task: parallelizing requests to multiple backends. Basically, what ZEORaid currently does when it needs to send write requests to multiple backends is:

What we want it to do is:

Reading (briefly) through the source code, I found 4 methods that loop over the list of backends in the “optimal” state and make calls:

The first three methods could be changed to start len(self.storages_optimal) threads to do the work; then we could use join() to wait for all answers and proceed. About the 4th method, _apply_single_storage(), Christian said that “Distributing the read requests in parallel is a good idea too, but a bit less important than distributing the write requests. Serialized writes currently cause a lot of overhead and slowness and reading won’t benefit as much, so I’d leave apply_single_storage out of the story for now.”, so I will focus on parallelizing write requests first. He also said that there should be some refactoring so that the looping/thread creation happens in only one place in the code.
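The start-then-join idea can be sketched roughly like this. It is an illustration only: `apply_all` and `FakeStorage` are hypothetical names, and the real ZEORaid methods also have to decide what to do with backends that fail.

```python
import threading


def apply_all(storages, method_name, *args):
    """Call method_name on every storage in its own thread.

    Returns one (ok, result_or_exception) tuple per storage, in order.
    """
    results = [None] * len(storages)

    def worker(index, storage):
        # Exceptions must be caught here: they do not propagate to the
        # thread that calls join().
        try:
            results[index] = (True, getattr(storage, method_name)(*args))
        except Exception as exc:
            results[index] = (False, exc)

    threads = [
        threading.Thread(target=worker, args=(i, s))
        for i, s in enumerate(storages)
    ]
    for thread in threads:
        thread.start()
    # Wait for all answers before proceeding.
    for thread in threads:
        thread.join()
    return results


class FakeStorage:
    """Tiny stand-in for a backend storage, for demonstration only."""

    def __init__(self, name):
        self.name = name

    def store(self, data):
        return f"{self.name} stored {data}"


print(apply_all([FakeStorage("a"), FakeStorage("b")], "store", "x"))
```

Keeping the thread creation in a single helper like this is one way to have the looping logic only once, as Christian suggested.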

This seems a bit more challenging than my last task (especially to test), which is great! I’m learning a lot about many things, and it’s been a lot of fun :-).


GSoC report - week 4

I had some problems this week (many exams and things to do at university), so this was a slow week for GSoC.

I worked on creating tests for last week’s patches. Unfortunately I went in the wrong direction, writing some code to monkeypatch the gocept.zeoraid.storage.RAIDStorage.__apply_single_storage() method to provide logging and see if the requests were being distributed; Christian later suggested that it would be better to write a simple storage implementation and avoid monkeypatching (it doesn’t need to be a ClientStorage-compatible storage, it just needs to implement some required methods, as gocept.zeoraid.tests.failingstorage.FailingStorage does).

Next week I will work on finishing this task and hopefully start my next one, which will probably be parallelizing requests to multiple backends.


GSoC report - week 3

Martijn suggested that I should write periodic reports about my work on GSoC, so here we go.

My first task was to test whether ZEORaid is processing invalidations from the backend storages too often. I’ve run some tests in different situations, and I haven’t found anything wrong yet. I have to put more work into this, but it will wait because there are more important things that need attention before the 1.0 release.

After that I’ve been working on distributing single requests across the available (optimal) backends. Right now ZEORaid (trunk) iterates over a list of optimal storages and gets the first reliable result; I wrote a simple patch to distribute the requests randomly.
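One way to picture the change, as a sketch and not the actual patch: shuffle a copy of the optimal storages before iterating, so that the first reliable result comes from a random backend. The name `first_reliable_result` and the `call` callback are assumptions made for this example.

```python
import random


def first_reliable_result(storages, call, rng=random):
    """Try the storages in random order and return the first answer.

    call is a function taking one storage; any exception it raises is
    treated as that storage failing, and the next one is tried.
    """
    candidates = list(storages)  # copy, so the caller's list keeps its order
    rng.shuffle(candidates)
    for storage in candidates:
        try:
            return call(storage)
        except Exception:
            continue
    raise RuntimeError("no storage returned a reliable result")
```

Passing a seeded `random.Random` instance as `rng` makes the order reproducible, which helps when writing tests that check that requests really reach different backends.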

My current task is to provide supplemental tests to verify that the new changes actually apply as predicted (e.g. that reading actually does trigger requests to different backends).


GSoC Application Accepted!

As most of you already know, my student application was accepted for the Google Summer of Code project. I will be working on gocept.zeoraid, a package that provides a proxy storage that works like a RAID controller by creating a redundant array of ZEO servers. ZEORaid was created by Christian Theune from gocept, who is my mentor for this project.

I’m using the time left until coding starts to learn how to use ZEORaid and how its code works. I’m very happy and excited about this project, and I’m already learning a lot from it.

Many thanks to everyone who helped me, and good luck to all fellow GSoC students!
