Sunday, October 30, 2011

Russian Phonetic Keyboard Qwerty (Яверты) for Windows

After installing Windows 7 I was looking for a Russian phonetic keyboard (the one I used for XP from kovrik.com stopped working for me) but could not find a good one.

There is a standard Russian keyboard coming with a windows installation but then you need to put some stickers on your keyboard. I you are lazy to put stickers, you need a keyboard where Russian A corresponds to English A, П to P, Р to R, С to S, etc. Even though it is quite a challenge to map 33 Russian letters to 26 Latin ones, you can be pretty close.
This phonetic keyboard is called ЯВЕРТЫ (5 Russian letters on the top left corner).

Thanks to the good guys from Microsoft, now it is easy to create a new keyboard layout for Windows. It should work for Windows 7 and XP, as well as Windows server editions.

Here is the layout I created:

Here how it looks when you press a Shift key:

And here how it is when pressing either Ctrl or Alt Gr + Ctrl keys:

Installation Instructions.
That's it. A new Russian text layout called Russian Яверты will be installed and visible in the Installed Programs. You can also uninstall it from there.

On the task bar in the bottom right corner you should the language bar (press Alt-Shift to switch between languages and Ctrl-Shift to switch between keyboard layouts). Choose  Russian Яверты layout:

If you don't see this language bar, go to Control Panel -> Clock, Language, and Region -> Change Keyboards and add the corresponding keyboard and layout in the General tab.

There is a bug on Windows XP - the layout is installed but you do not see it when you click on the keyboard icon as on the Windows 7 screenshot above. To get around this, do the following. Open the keyboard settings (Control Panel -> Regional and Language Options -> Languages -> Details -> Settings) and add the new Russian Яверты layout by clicking on "Add..." and choosing Russian Яверты from a list of keyboard layouts for Russian. You will get something like this:


Then clik on "Key Settings.." and assign a key sequence to the new layout, e.g. Ctrl+Shift+0:

Click OK and save the settings.
Now to enable the new Russian layout just use the assigned sequence (e.g. Ctrl+Shift+0).
Strangely enough, sometimes you need to press the key sequence (Ctrl+Shift+0) twice to enable the layout.
Also the new layout won't be visible on the taskbar. Time to migrate to Windows 7?

Friday, February 25, 2011

Microsoft ZDC being shit down

Unfortunately our beautiful office is being shut down.
Looks like we are too expensive :(
Here are a few pics from our office facilities


Cafeteria

Air Hockey and Darts

Climbing Wall




Foosball


Game Room

Pool

Table Tennis
XBox

Tuesday, January 18, 2011

Windows Phone 7 vs 6.5: One Step Forward, Two Steps Back

Microsoft has given Windows Phone 7 to all of its employees. Before I got Phone 6.5 also from Microsoft. So now I can compare both.
 
Windows Phone Features / Version
6.5
7
Multi-touch
Browsing experience is much better for 7


Radio FM
Really sucks in 7. No memorized stations,  possibility to switch between headphones and a speaker.


Apps and App Store
Nothing useful as of today for 7.
Phone 7 follows a bad for the customer tradition of requiring a developer to spend 100 bucks in order to upload an app. With 6.5 you could install any 6.5 CAB file you found on the Web. There are quite a few free apps whose creators won’t pay these 100 bucks


Copy-Paste
Currently missing in 7


Uploading Photos with One Click
The cool stuff with Phone 7 is that you can upload a Photo to your 25GB free Sky-Dive account  with just one click


Support for writing in non-ASCII
E.g. if you want to write in Russian, or even include a non-ASCII word  forget it (on top of you do not have copy-paste which could’ve been a work around)


Editing text you type
You have to delete the whole word or sentence – e.g. I could not figure out putting a cursor in the middle of the word


Battery Indicator
Hard to beleive but Phone 7 won't tell you exactly how much battery life is left. Be prepared for unexpected shutdowns


Multi-tasking
Missing in 7. It also does not work on Phone 6.5 good (the apps are sometimes just shut down – looks like a regression from Windows Phone 6, where it worked better)


Having access to most of the stuff on the main page
You can set up Phone 6.5 so that on the main page you can see weather, stocks, top news, turn on and off with one click the phone, WI-FI, Bluetooth, ring, vibrate…. And 10 most useful apps on the same screen on top of that!



Last but not least, if you are using Windows Phone 6.5, install my free Gmail client: NanoGmail

As Wayne Gretzky once said, "I skate to where the puck is going to be, not where it has been." Unfortunately, with Windows Phone 7, Microsoft is starting to play where the puck was long time ago and ice is already melting.

Monday, January 10, 2011

Trimming WMA File Duration

A part of dial-in conferencing services that we just shipped as a part Microsoft Lync 2010 has to do with recording a name for an anonymous user when joining a conference and then playing it during the conference (e.g. "Mike Johnston has joined the conference").

The problem here is that sometimes, depending on a Gateway, there is an ending beep (recorded from the '#' sign that a user presses when she is done with recording). It is difficult to predict in advance how many milliseconds should be cut from the user name recorded file (since in some systems there is no trailing beep at all). By default we cut 500 milliseconds and let the administrator the possibility to configure a different value if she wants to.

What is the best way for admin to figure out how many milliseconds should be trimmed from a file containing a beep? Easy - here is a tool that when run creates a serie of files with n milliseconds  trimmed from its end. Unfortunately it was too late to include this tool as a part of Lync Reskit Tool but this issue should be definitely fixed by the next Lync release.

Instructions:
  • Download the tool, unzip it, and place the executable to a directory, say C:\trim
  • Copy a WMA file containing the trailing beep to the same directory and name it target.wma
  • Run the tool trimfile.exe without any parameters
  • As a result 21 files reslt100.wma, result150.wma, ..., result1000.wma will be created and placed to the same directory. The number indicates (approximate) number of milliseconds cut from the file.
  • Listen to the resulting files to figure out how many milliseconds should be cut from the file.
The tool was created using Windows Media C++ libraries. If you want to play around with it, take a look at main.cppWMVCopy.h and WMVCopy.cpp

Friday, December 31, 2010

Merry Xmas & Happy New Year 2011!

Merry Xmas & Happy New Year 2011!


I am as a Russian "Father Frost". From Russia with Love.

Thursday, December 23, 2010

Microsoft Lync 2010 Resource Kit Tools

Microsoft has started publishing Lync 2010 Resource Kit articles online. First article: Call Park-o-Meter has been written by me. It is nice to be first:)

It is about a tool that I have written to faciltate usage of the Lync Call Park Service, that I have been developing in Microsoft Zurich Development Center. The difference between a service which is a part of Microsoft Lync Server and a Microsoft Lync Reskit tool is that the latter is not shipped together with the product but can be downloaded and installed separately.

It is unclear if a hard copy will be published this time as it was for OCS 2007 R2 Reskit book.
I published a sidebar there for OCS RGS (Response Group Service). RGS is another voice application that is also being developed in Zurich Development Center.

Tuesday, December 21, 2010

A C++ Implementation of a Mathematical Expression Parser

The topic of a parser for mathematical expressions is one of the most basic in Computer Science and has been implemented probably in thousands of computer programs. Nevertheless, after some googling, I wasn’t able to find a simple code in object oriented C++. What I found was either code that was written in an old C-style procedural language (called C++ because of some new C++ constructs), kind of “write once, read never”, or the code was hidden, requiring users to pay hundreds of dollars to read the source.

The purpose of this article is to present a simple code for parsing mathematical expressions (in total it has just over 300 lines of code, including both implementation classes and a command line client) in an easy to read C++. The presented implementation has a slight modification of the general parsing algorithm. The evaluation stack contains structures, each of which contains a double value and a corresponding sign, found right after the value. The computation is done via creating a number of stacks and combining the above mentioned structures. The parsing string is scanned only once during processing. Another advantage of this implementation is the decoupling of the parser and function definitions, which can be easily added on the fly. The function definition data structure consists of a map from a string token to the pointer of the function implementing this token functionality.

Algorithm description

The parser has just two classes – EZParser, which contains all the parsing logic, and EZParserFunctions which contains function definitions (e.g. exp, sin, cos, pi, etc.). If there is a need to include a new function or functionality (e.g. atan, or differentiation), either EZParserFunctions class has to be augmented or a new class has to be derived from EZParserFunctions and passed to EZParser during initialization (the corresponding virtual methods of EZParserFunctions class should be overridden then).

The mathematical expression string can have the following items, ordered by priority:

  1. Parenthesis.
  2. Functions from EZParserFunctions class (exp, sin, cos, etc.).
  3. Power sign ‘^’.
  4. Multiplication ‘*’ and division ‘/’ signs.
  5. Addition ‘+’ and subtraction ‘-‘signs.
 The algorithm is O(n), where n is the string length, and the string is scanned only once.
Note that the parser (EZParser::load() function) expects the string without any white characters and with the same number of opening and closing parenthesis, so if the string may contain whites/uneven parenthesis, another scann through the string is needed (implemented in EZParser::preprocess function).

The string is scanned only once during processing because of the “size_t from” in and out function argument. It points to the current character in the parsing stringand it is incremented after each character is read from the parsing string.

This is the main parsing class definition. Note that only standard C++ header files are needed in addition to a function definition class (we’ll talk about it later):


#include <vector>
#include <map>
#include <cmath>
#include <cstdio>
#include <iostream>

using namespace std;

class EZParserFunctions;

class EZParser
{
  public:
    class  CalculationException : public exception
    {
          public:
            CalculationException(string err) : exception(), m_what(err) {}
            ~CalculationException() throw() {};
            const char* what() const throw() { return m_what.c_str(); }
          private:
            string m_what;
    };

    EZParser() { stack.reserve(16); } // an "intelligent" guess
    double calculate(const string& data);
    double load(const string& data, size_t& from, char c = '\n');

    static void preprocess(const string& from, string& to);
    static EZParserFunctions* theFunctions;

  private:
    struct Cell
    {
          Cell(double v, char ch) { value = v; sign = ch; }
          double value;
          char sign;
    };

    double calculate();
    void calculateNext(Cell& base, Cell& current, size_t& index);

    static char updateSign(const string& item, size_t& from, char ch);
    static bool validSign(char ch);

    static bool stillCollecting(const string& item, char ch);
    static void mergeCells(Cell& base, Cell& next);

    vector<Cell> stack;
};

The main entry point is the following function:

double EZParser::calculate(const string& data)
{
        string cleanData;
        EZParser::preprocess(data, cleanData);
        size_t iter = 0;
        return load(cleanData, iter);
}
which in turn calls the following main processing function:

double EZParser::load(const string& data, size_t& from, char end)
{
        if (from >= data.size() || data[from] == end)
        {
                throw CalculationException("Loaded invalid data " + data);
        }

        stack.clear();
        string item;

        do {  // main processing cycle
                char ch = data[from++];
                if (stillCollecting(item, ch))
                { // the char still  belongs to the previous operand
                        item += ch;
                        if (from < data.size() && data[from] != end)
                                continue;
                }
                double value = theFunctions->getValue(data, from, item, ch);
                char sign = (end != '\n' && validSign(ch)) ? ch : updateSign(data, from, ch);
                stack.push_back(Cell(value, sign));
                item.clear();
        }       while(from < data.size() && data[from] != end);

        if (from < data.size() && data[from] == ')')
        { // end of parenthesis: move one char forward
            from++;
        }

        return calculate();
}
Let’s see how it works via an example.
Suppose we have the following mathematical expression for evaluation:

exp(1+(15-2*2^3))

i.e. the load() function is called with the following arguments:

load(“exp(1+(15-2*2^3))”, 0, ‘\n’);

Inside of the load function we start parsing this string character till we find the end of token condition, which is the following (“item” is the newly created token, “ch” is the current character, i.e. we start with item = “”, ch = ‘e’, then, on the next loop, item = “e”, ch = ‘x’, then item = “ex”, ch = ‘p’, and so on):

bool EZParser::stillCollecting(const string& item, char ch)
{
        return (item.size() == 0 && (ch == '-' || ch == ')')) ||
          (ch != '*' && ch != '/' && ch != '+' && ch != '-' && ch != '^' && ch != '(' && ch != ')');
}

In other words we stop creating a new token as soon as we find parenthesis or an arithmetical operation. EZParser::stillCollecting will return false when item = “exp” and ch = ‘(‘.

Next we check if the new token is one of the predefined functions.
All the functions are defined in a different class, EZParserFunctions:

class EZParserFunctions
{
        public:
                typedef double (EZParserFunctions::*ptrFunc)(const string& arg, size_t& from);
                EZParserFunctions();

                virtual double getValue(const string& data, size_t& from, const string& item, char ch);

                double exp(const string& arg, size_t& from);
                double abs(const string& arg, size_t& from);
                double log(const string& arg, size_t& from);
                double avg(const string& arg, size_t& from);
                double sin(const string& arg, size_t& from);
                double cos(const string& arg, size_t& from);
                double tan(const string& arg, size_t& from);
                double sqrt(const string& arg, size_t& from);
                double round(const string& arg, size_t& from);
                double identity(const string& arg, size_t& from);
                double pi(const string& arg, size_t& from);

            static double strtod(const string& str);

        protected:
                map<string, ptrFunc> functions;
};

Here are some of the implementations:

EZParserFunctions::EZParserFunctions()
{
        functions["exp"]   = EZParserFunctions::exp;
        functions["log"]   = EZParserFunctions::log;
        functions["sin"]   = EZParserFunctions::sin;
        functions["cos"]   = EZParserFunctions::cos;
        functions["sqrt"]  = EZParserFunctions::sqrt;
        functions["abs"]   = EZParserFunctions::abs;
        functions["round"] = EZParserFunctions::round;
        functions["pi"]    = EZParserFunctions::pi;
}

double EZParserFunctions::identity(const string& arg, size_t& from)
{
        EZParser sub;
        return sub.load(arg, from, ')');
}
double EZParserFunctions::pi(const string& arg, size_t& from)
{
        return 3.141592653589793;
}
double EZParserFunctions::exp(const string& arg, size_t& from)
{
        EZParser sub;
        return ::exp(sub.load(arg, from, ')'));
}
The main advantage of this class is that it has a map of string tokens to a pointer to an actual function implementation. Note that a map from a string to a function pointer is quite efficient when the number of elements is small (besides of another great advantage of being a standard C++ data structure).

Returning back our example, we are in EZParser::load function, in the call on line 15:

    double value = theFunctions->getValue(“exp(1+(15-2*2^3))”, 4, “exp”, ‘(‘);
The implementation of the called virtual function is the following:

double EZParserFunctions::getValue(const string& data, size_t& from, const string& item, char ch)
{
        if (item.empty() && ch == '(')
        {
                return identity(data, from);
        }
        map<string, ptrFunc>::const_iterator it = functions.find(item);
        return it == functions.end() ? strtod(item) : (this->*(it->second))(data, from);
}
This function will find EZParserFunctions::exp function in the functions map, i.e. we will return the following call from the above function:

    return this->exp((“exp(1+(15-2*2^3))”, 4);  // last line of EZParserFunctions::getValue

Note that if we did not have a definition of the exp function in EZParserFunctions file, the call to EZParserFunctions::strtod(“exp”) would have been made, which would’ve thrown an exception:

double EZParserFunctions::strtod(const string& str)
{
    char* x = new char[str.size() + 1];
    double num = ::strtod(str.c_str(), &x);
    if (::strlen(x) > 0)
    {
          throw EZParser::CalculationException("Could not parse token [" + str + "]");
    }
    return num;
}
But in our case, inside of EZParserFunctions::exp() function, we will first need to load the argument for the exponent. This will be done via defining a new EZParser object and calling its load method with the following arguments:

    sub.load(“exp(1+(15-2*2^3))”, 4, ')');  // line 2 of EZParserFunctions::exp() function

i.e. load() function will start parsing the string from the 4th element (starting counting from 0), i.e. the working substring will be “1+(15-2*2^3))”.

Inside of the load() function we will get the first token as “1”, and the sign as ‘+’. So the following Cell object will be created and pushed to the stack:

    double value = strtod(“1”); // = 1; line 15 of EZParser::load() function
    stack.push_back(Cell(1, ‘+’)); // line 17 of EZParser::load() function
Let’s take a brief look at line 16 of the load function. What it does is the following: if the current character after the token is a valid sign (as it is in our case, since the sign ‘+’ follows immediately number 1) it just takes it. But if it were not a valid sign (e.g. consider expression “(1+2)+3”: after parsing the second token “2” the sign is ‘)’ which is an invalid sign, so the updateSign function would have been called which would’ve examined the string till it found a valid sign (‘+’ in our case)). Note that the last token in an expression always has an invalid sign, but as we will see later this is OK. Here are the definitions of the functions for finding a token’s sign:

char EZParser::updateSign(const string& item, size_t& from, char ch)
{
        if (from >= item.size() || item[from] == ')')
        {
                return ')';
        }
        size_t tmp = from;
        char res = ch;
        while (!validSign(res) && tmp < item.size())
        {
                res = item[tmp++];
        }

        from = validSign(res) ? tmp : tmp > from ? tmp - 1 : from;
        return res;
}

bool EZParser::validSign(char ch)
{
        return ch == '*' || ch == '/' || ch == '+' || ch == '-' || ch == '^';
}
Now let’s continue parsing our string in the EZParser::load() function. Since the condition on line 19 is true, we will continue the cycle and after line 8 we will have ch = ‘(‘ and from =  7. So stillCollecting(“”, ‘(‘) function on line 9 will return false.
Then the call to

    theFunctions->getValue(“exp(1+(15-2*2^3))”, 7, “”, ‘(‘) // line 17 of EZParser::load():
will return a special case, an “identity” function (line 3 of EZParserFunctions::getValue function). This function is used to process and change in place everything that is between parentheses (since they have the highest precedence).
Inside of EZParserFunctions::identity function we create a new EZParser for our substring and load it with the following call:

    sub.load(“exp(1+(15-2*2^3))””, 7, ')');
Basically that means that we are parsing “15-2*2^3))” substring. This substring now does not have any subfunctions so we’ll get a stack consisting of four Cell objects constructed via subsequent calls to the lines 20 and 21of the EZParser::load() function. The stack will contain these four elements:
Cell (15, ‘-‘), Cell(2, ‘*’), Cell(2, ‘^’), Cell(3, ‘)’).
Then the EZParser::calculate() function will be called from the line 26 of EZParser::load():

double EZParser::calculate()
{
        if (stack.size() == 0)
        {
                throw EZParser::CalculationException("Empty stack for calculation");
        }
        Cell base = stack.at(0);
        size_t index = 1;

        while(index < stack.size())
        {
                Cell next = stack.at(index++);
                if (!(base.sign == '^' || ((base.sign == '*' || base.sign == '/') && next.sign != '^') ||
                        next.sign == '+' || next.sign == '-' || next.sign == ')'))
                {
                        calculateNext(base, next, index);
                }

                mergeCells(base, next);
        }
        return base.value;
}
The main logic of this function is on lines 10-11. If we cannot perform a mathematical operation right away, i.e. if the second sign has a higher precedence than the current one, (as it is in our case when working with the first two cells, Cell (15, ‘-‘), Cell(2, ‘*’), where ‘*’ has a higher precedence as ‘-‘), we need to move on and try to merge the next two cells using the following function:

void EZParser::calculateNext(Cell& base, Cell& current, size_t& index)
{
        if (index >= stack.size())
        {
                return;
        }

        size_t tmp = index;
        Cell next = stack.at(index++);

        if (!(base.sign == '^' || ((base.sign == '*' || base.sign == '/') && next.sign != '^') ||
                next.sign == '+' || next.sign == '-' || next.sign == ')'))
        {
                calculateNext(current, next, index);
        }
        mergeCells(current, next);
}
So we won’t be able to merge cells when working with cells Cell (15, ‘-‘) and Cell(2, ‘*’) and with Cell(2, ‘*’) and  Cell(2, ‘^’). First time the function mergeCells will be called is when merging the last two cells: Cell(2, ‘^’), Cell(3, ‘)’). The mergeCells function is doing actual computation:

void EZParser::mergeCells(Cell& base, Cell& next)
{
        switch(base.sign)
        {
                case '^' : base.value = ::pow(base.value, next.value);
                        break;
                case '*' : base.value *= next.value;
                        break;
                case '/' :
                        if (next.value == 0)
                        {
                                throw CalculationException("Division by zero");
                        }
                        base.value /= next.value;
                        break;
                case '+' : base.value += next.value;
                        break;
                case '-' : base.value -= next.value;
                        break;
        }
        base.sign = next.sign;
}
The result of the calculation will be saved in the “base” cell – note that the “sign” of the base cell is updated after merging cells.
Basically that’s it. Now we will be “unwinding” the stack: calling mergeCells() with Cell(2, ‘^’) and Cell(3, ‘)’) will return the base Cell(8, ‘)’), then calling mergeCells() with Cell(2, ‘*’) and Cell(8, ‘)’) will return the base Cell(16, ‘)’), and finally calling mergeCells() with Cell (15, ‘-‘) and Cell(16, ‘)’) will return the base Cell(-1, ‘)’).

So our call to sub.load(“exp(1+(15-2*2^3))””, 7, ')'); from the EZParserFunctions::identity function will return -1.
Now we have just two cells in our stack: remember than we already have Cell(1, ‘+’) on the stack which combined with our resulting Cell(-1, ‘)’) will give us 1 -1 = 0.
That means that we found our argument for the exp function (line 2 of EZParserFunctions::exp function). So the final result will be ::exp(0) = 1.

Finally here is a simple main function to test the parser (the parsing string is the last argument to this function when calling it from the command line).

int main(int argc, char* argv[])
{
    string str = argv[argc - 1];
    try
    {
            EZParser ts;
                double res = ts.calculate(str);
                cout << "Got result: " << str << " -> " << res << endl;
        }
        catch(EZParser::CalculationException& e)
        {
                cout << "Calculation Exception for " << str << ": " << e.what() << endl;
        }
        catch(std::exception& e)
        {
                cout << "Standard Exception for " << str << ": " << e.what() << endl;
        }
        return 0;
}

The whole file is located here and is tested using a GNU compiler on Linux.

 

Conclusions


There are several things that should be considered in the above presented algorithm.
  • After getting a token first we checked that the token corresponds to a function name and if not we check that the token can be parsed to a number. Maybe the order should be switched if it is known in advance that the string contains mostly numbers.
  • There is a separate calculation of opened and closed brackets and removal of blanks from the passed string (function EZParser::preprocess). These operations can be combined inside of the EZParser::load() procedure for more efficient processing. But the disadvantage of this is that there will be a slight performance loss for strings that are known to be “OK” and don’t require pre-processing.
  • Currently adding a new functionality (e.g. new functions, constants, differentiation/integration, etc.) requires either supplying a new class derived from EZParserFunctions and overriding getValue() function or modifying EZParserFunctions: adding a new function definition and a new function mapping. This part could be made easier by supplying a new convenience function or idiom.