Tuesday, December 21, 2010

A C++ Implementation of a Mathematical Expression Parser

The topic of a parser for mathematical expressions is one of the most basic in Computer Science and has been implemented probably in thousands of computer programs. Nevertheless, after some googling, I wasn’t able to find a simple code in object oriented C++. What I found was either code that was written in an old C-style procedural language (called C++ because of some new C++ constructs), kind of “write once, read never”, or the code was hidden, requiring users to pay hundreds of dollars to read the source.

The purpose of this article is to present a simple code for parsing mathematical expressions (in total it has just over 300 lines of code, including both implementation classes and a command line client) in an easy to read C++. The presented implementation has a slight modification of the general parsing algorithm. The evaluation stack contains structures, each of which contains a double value and a corresponding sign, found right after the value. The computation is done via creating a number of stacks and combining the above mentioned structures. The parsing string is scanned only once during processing. Another advantage of this implementation is the decoupling of the parser and function definitions, which can be easily added on the fly. The function definition data structure consists of a map from a string token to the pointer of the function implementing this token functionality.

Algorithm description

The parser has just two classes – EZParser, which contains all the parsing logic, and EZParserFunctions which contains function definitions (e.g. exp, sin, cos, pi, etc.). If there is a need to include a new function or functionality (e.g. atan, or differentiation), either EZParserFunctions class has to be augmented or a new class has to be derived from EZParserFunctions and passed to EZParser during initialization (the corresponding virtual methods of EZParserFunctions class should be overridden then).

The mathematical expression string can have the following items, ordered by priority:

  1. Parenthesis.
  2. Functions from EZParserFunctions class (exp, sin, cos, etc.).
  3. Power sign ‘^’.
  4. Multiplication ‘*’ and division ‘/’ signs.
  5. Addition ‘+’ and subtraction ‘-‘signs.
 The algorithm is O(n), where n is the string length, and the string is scanned only once.
Note that the parser (EZParser::load() function) expects the string without any white characters and with the same number of opening and closing parenthesis, so if the string may contain whites/uneven parenthesis, another scann through the string is needed (implemented in EZParser::preprocess function).

The string is scanned only once during processing because of the “size_t from” in and out function argument. It points to the current character in the parsing stringand it is incremented after each character is read from the parsing string.

This is the main parsing class definition. Note that only standard C++ header files are needed in addition to a function definition class (we’ll talk about it later):


#include <vector>
#include <map>
#include <cmath>
#include <cstdio>
#include <iostream>

using namespace std;

class EZParserFunctions;

class EZParser
{
  public:
    class  CalculationException : public exception
    {
          public:
            CalculationException(string err) : exception(), m_what(err) {}
            ~CalculationException() throw() {};
            const char* what() const throw() { return m_what.c_str(); }
          private:
            string m_what;
    };

    EZParser() { stack.reserve(16); } // an "intelligent" guess
    double calculate(const string& data);
    double load(const string& data, size_t& from, char c = '\n');

    static void preprocess(const string& from, string& to);
    static EZParserFunctions* theFunctions;

  private:
    struct Cell
    {
          Cell(double v, char ch) { value = v; sign = ch; }
          double value;
          char sign;
    };

    double calculate();
    void calculateNext(Cell& base, Cell& current, size_t& index);

    static char updateSign(const string& item, size_t& from, char ch);
    static bool validSign(char ch);

    static bool stillCollecting(const string& item, char ch);
    static void mergeCells(Cell& base, Cell& next);

    vector<Cell> stack;
};

The main entry point is the following function:

double EZParser::calculate(const string& data)
{
        string cleanData;
        EZParser::preprocess(data, cleanData);
        size_t iter = 0;
        return load(cleanData, iter);
}
which in turn calls the following main processing function:

double EZParser::load(const string& data, size_t& from, char end)
{
        if (from >= data.size() || data[from] == end)
        {
                throw CalculationException("Loaded invalid data " + data);
        }

        stack.clear();
        string item;

        do {  // main processing cycle
                char ch = data[from++];
                if (stillCollecting(item, ch))
                { // the char still  belongs to the previous operand
                        item += ch;
                        if (from < data.size() && data[from] != end)
                                continue;
                }
                double value = theFunctions->getValue(data, from, item, ch);
                char sign = (end != '\n' && validSign(ch)) ? ch : updateSign(data, from, ch);
                stack.push_back(Cell(value, sign));
                item.clear();
        }       while(from < data.size() && data[from] != end);

        if (from < data.size() && data[from] == ')')
        { // end of parenthesis: move one char forward
            from++;
        }

        return calculate();
}
Let’s see how it works via an example.
Suppose we have the following mathematical expression for evaluation:

exp(1+(15-2*2^3))

i.e. the load() function is called with the following arguments:

load(“exp(1+(15-2*2^3))”, 0, ‘\n’);

Inside of the load function we start parsing this string character till we find the end of token condition, which is the following (“item” is the newly created token, “ch” is the current character, i.e. we start with item = “”, ch = ‘e’, then, on the next loop, item = “e”, ch = ‘x’, then item = “ex”, ch = ‘p’, and so on):

bool EZParser::stillCollecting(const string& item, char ch)
{
        return (item.size() == 0 && (ch == '-' || ch == ')')) ||
          (ch != '*' && ch != '/' && ch != '+' && ch != '-' && ch != '^' && ch != '(' && ch != ')');
}

In other words we stop creating a new token as soon as we find parenthesis or an arithmetical operation. EZParser::stillCollecting will return false when item = “exp” and ch = ‘(‘.

Next we check if the new token is one of the predefined functions.
All the functions are defined in a different class, EZParserFunctions:

class EZParserFunctions
{
        public:
                typedef double (EZParserFunctions::*ptrFunc)(const string& arg, size_t& from);
                EZParserFunctions();

                virtual double getValue(const string& data, size_t& from, const string& item, char ch);

                double exp(const string& arg, size_t& from);
                double abs(const string& arg, size_t& from);
                double log(const string& arg, size_t& from);
                double avg(const string& arg, size_t& from);
                double sin(const string& arg, size_t& from);
                double cos(const string& arg, size_t& from);
                double tan(const string& arg, size_t& from);
                double sqrt(const string& arg, size_t& from);
                double round(const string& arg, size_t& from);
                double identity(const string& arg, size_t& from);
                double pi(const string& arg, size_t& from);

            static double strtod(const string& str);

        protected:
                map<string, ptrFunc> functions;
};

Here are some of the implementations:

EZParserFunctions::EZParserFunctions()
{
        functions["exp"]   = EZParserFunctions::exp;
        functions["log"]   = EZParserFunctions::log;
        functions["sin"]   = EZParserFunctions::sin;
        functions["cos"]   = EZParserFunctions::cos;
        functions["sqrt"]  = EZParserFunctions::sqrt;
        functions["abs"]   = EZParserFunctions::abs;
        functions["round"] = EZParserFunctions::round;
        functions["pi"]    = EZParserFunctions::pi;
}

double EZParserFunctions::identity(const string& arg, size_t& from)
{
        EZParser sub;
        return sub.load(arg, from, ')');
}
double EZParserFunctions::pi(const string& arg, size_t& from)
{
        return 3.141592653589793;
}
double EZParserFunctions::exp(const string& arg, size_t& from)
{
        EZParser sub;
        return ::exp(sub.load(arg, from, ')'));
}
The main advantage of this class is that it has a map of string tokens to a pointer to an actual function implementation. Note that a map from a string to a function pointer is quite efficient when the number of elements is small (besides of another great advantage of being a standard C++ data structure).

Returning back our example, we are in EZParser::load function, in the call on line 15:

    double value = theFunctions->getValue(“exp(1+(15-2*2^3))”, 4, “exp”, ‘(‘);
The implementation of the called virtual function is the following:

double EZParserFunctions::getValue(const string& data, size_t& from, const string& item, char ch)
{
        if (item.empty() && ch == '(')
        {
                return identity(data, from);
        }
        map<string, ptrFunc>::const_iterator it = functions.find(item);
        return it == functions.end() ? strtod(item) : (this->*(it->second))(data, from);
}
This function will find EZParserFunctions::exp function in the functions map, i.e. we will return the following call from the above function:

    return this->exp((“exp(1+(15-2*2^3))”, 4);  // last line of EZParserFunctions::getValue

Note that if we did not have a definition of the exp function in EZParserFunctions file, the call to EZParserFunctions::strtod(“exp”) would have been made, which would’ve thrown an exception:

double EZParserFunctions::strtod(const string& str)
{
    char* x = new char[str.size() + 1];
    double num = ::strtod(str.c_str(), &x);
    if (::strlen(x) > 0)
    {
          throw EZParser::CalculationException("Could not parse token [" + str + "]");
    }
    return num;
}
But in our case, inside of EZParserFunctions::exp() function, we will first need to load the argument for the exponent. This will be done via defining a new EZParser object and calling its load method with the following arguments:

    sub.load(“exp(1+(15-2*2^3))”, 4, ')');  // line 2 of EZParserFunctions::exp() function

i.e. load() function will start parsing the string from the 4th element (starting counting from 0), i.e. the working substring will be “1+(15-2*2^3))”.

Inside of the load() function we will get the first token as “1”, and the sign as ‘+’. So the following Cell object will be created and pushed to the stack:

    double value = strtod(“1”); // = 1; line 15 of EZParser::load() function
    stack.push_back(Cell(1, ‘+’)); // line 17 of EZParser::load() function
Let’s take a brief look at line 16 of the load function. What it does is the following: if the current character after the token is a valid sign (as it is in our case, since the sign ‘+’ follows immediately number 1) it just takes it. But if it were not a valid sign (e.g. consider expression “(1+2)+3”: after parsing the second token “2” the sign is ‘)’ which is an invalid sign, so the updateSign function would have been called which would’ve examined the string till it found a valid sign (‘+’ in our case)). Note that the last token in an expression always has an invalid sign, but as we will see later this is OK. Here are the definitions of the functions for finding a token’s sign:

char EZParser::updateSign(const string& item, size_t& from, char ch)
{
        if (from >= item.size() || item[from] == ')')
        {
                return ')';
        }
        size_t tmp = from;
        char res = ch;
        while (!validSign(res) && tmp < item.size())
        {
                res = item[tmp++];
        }

        from = validSign(res) ? tmp : tmp > from ? tmp - 1 : from;
        return res;
}

bool EZParser::validSign(char ch)
{
        return ch == '*' || ch == '/' || ch == '+' || ch == '-' || ch == '^';
}
Now let’s continue parsing our string in the EZParser::load() function. Since the condition on line 19 is true, we will continue the cycle and after line 8 we will have ch = ‘(‘ and from =  7. So stillCollecting(“”, ‘(‘) function on line 9 will return false.
Then the call to

    theFunctions->getValue(“exp(1+(15-2*2^3))”, 7, “”, ‘(‘) // line 17 of EZParser::load():
will return a special case, an “identity” function (line 3 of EZParserFunctions::getValue function). This function is used to process and change in place everything that is between parentheses (since they have the highest precedence).
Inside of EZParserFunctions::identity function we create a new EZParser for our substring and load it with the following call:

    sub.load(“exp(1+(15-2*2^3))””, 7, ')');
Basically that means that we are parsing “15-2*2^3))” substring. This substring now does not have any subfunctions so we’ll get a stack consisting of four Cell objects constructed via subsequent calls to the lines 20 and 21of the EZParser::load() function. The stack will contain these four elements:
Cell (15, ‘-‘), Cell(2, ‘*’), Cell(2, ‘^’), Cell(3, ‘)’).
Then the EZParser::calculate() function will be called from the line 26 of EZParser::load():

double EZParser::calculate()
{
        if (stack.size() == 0)
        {
                throw EZParser::CalculationException("Empty stack for calculation");
        }
        Cell base = stack.at(0);
        size_t index = 1;

        while(index < stack.size())
        {
                Cell next = stack.at(index++);
                if (!(base.sign == '^' || ((base.sign == '*' || base.sign == '/') && next.sign != '^') ||
                        next.sign == '+' || next.sign == '-' || next.sign == ')'))
                {
                        calculateNext(base, next, index);
                }

                mergeCells(base, next);
        }
        return base.value;
}
The main logic of this function is on lines 10-11. If we cannot perform a mathematical operation right away, i.e. if the second sign has a higher precedence than the current one, (as it is in our case when working with the first two cells, Cell (15, ‘-‘), Cell(2, ‘*’), where ‘*’ has a higher precedence as ‘-‘), we need to move on and try to merge the next two cells using the following function:

void EZParser::calculateNext(Cell& base, Cell& current, size_t& index)
{
        if (index >= stack.size())
        {
                return;
        }

        size_t tmp = index;
        Cell next = stack.at(index++);

        if (!(base.sign == '^' || ((base.sign == '*' || base.sign == '/') && next.sign != '^') ||
                next.sign == '+' || next.sign == '-' || next.sign == ')'))
        {
                calculateNext(current, next, index);
        }
        mergeCells(current, next);
}
So we won’t be able to merge cells when working with cells Cell (15, ‘-‘) and Cell(2, ‘*’) and with Cell(2, ‘*’) and  Cell(2, ‘^’). First time the function mergeCells will be called is when merging the last two cells: Cell(2, ‘^’), Cell(3, ‘)’). The mergeCells function is doing actual computation:

void EZParser::mergeCells(Cell& base, Cell& next)
{
        switch(base.sign)
        {
                case '^' : base.value = ::pow(base.value, next.value);
                        break;
                case '*' : base.value *= next.value;
                        break;
                case '/' :
                        if (next.value == 0)
                        {
                                throw CalculationException("Division by zero");
                        }
                        base.value /= next.value;
                        break;
                case '+' : base.value += next.value;
                        break;
                case '-' : base.value -= next.value;
                        break;
        }
        base.sign = next.sign;
}
The result of the calculation will be saved in the “base” cell – note that the “sign” of the base cell is updated after merging cells.
Basically that’s it. Now we will be “unwinding” the stack: calling mergeCells() with Cell(2, ‘^’) and Cell(3, ‘)’) will return the base Cell(8, ‘)’), then calling mergeCells() with Cell(2, ‘*’) and Cell(8, ‘)’) will return the base Cell(16, ‘)’), and finally calling mergeCells() with Cell (15, ‘-‘) and Cell(16, ‘)’) will return the base Cell(-1, ‘)’).

So our call to sub.load(“exp(1+(15-2*2^3))””, 7, ')'); from the EZParserFunctions::identity function will return -1.
Now we have just two cells in our stack: remember than we already have Cell(1, ‘+’) on the stack which combined with our resulting Cell(-1, ‘)’) will give us 1 -1 = 0.
That means that we found our argument for the exp function (line 2 of EZParserFunctions::exp function). So the final result will be ::exp(0) = 1.

Finally here is a simple main function to test the parser (the parsing string is the last argument to this function when calling it from the command line).

int main(int argc, char* argv[])
{
    string str = argv[argc - 1];
    try
    {
            EZParser ts;
                double res = ts.calculate(str);
                cout << "Got result: " << str << " -> " << res << endl;
        }
        catch(EZParser::CalculationException& e)
        {
                cout << "Calculation Exception for " << str << ": " << e.what() << endl;
        }
        catch(std::exception& e)
        {
                cout << "Standard Exception for " << str << ": " << e.what() << endl;
        }
        return 0;
}

The whole file is located here and is tested using a GNU compiler on Linux.

 

Conclusions


There are several things that should be considered in the above presented algorithm.
  • After getting a token first we checked that the token corresponds to a function name and if not we check that the token can be parsed to a number. Maybe the order should be switched if it is known in advance that the string contains mostly numbers.
  • There is a separate calculation of opened and closed brackets and removal of blanks from the passed string (function EZParser::preprocess). These operations can be combined inside of the EZParser::load() procedure for more efficient processing. But the disadvantage of this is that there will be a slight performance loss for strings that are known to be “OK” and don’t require pre-processing.
  • Currently adding a new functionality (e.g. new functions, constants, differentiation/integration, etc.) requires either supplying a new class derived from EZParserFunctions and overriding getValue() function or modifying EZParserFunctions: adding a new function definition and a new function mapping. This part could be made easier by supplying a new convenience function or idiom.

Friday, December 10, 2010

"Constants" Framework in .Net

Motivation:
  • Constants are spread all over the code, in different files
  • Sometimes they are defined in Constants.cs or AllConstants.cs 
  • Some are defined as constants in the file it is used
  • Some are defined right before being used
  • Sometimes they are not defined at all, just used in place as a number or as a string
  • You have to be familiar with the code in order to tell the value of a particular constant and/or change it
  • Same constants are defined multiple times in different files
  • To make an existing constant being read from a config file you need to write some non trivial code
With the new Framework
  • All constants are defined in a single XML file
  • The framework introduces a very consistent way of declaring and using constants. Strong typing remains ensured. Code duplication is reduced.
  • With the new framework to make a constant editable is as easy as changing the mode from “const” to either mode=“runtime” (always possible to change the value of a “constant” via modifying a configuration XML file and restarting the service) or mode=”runtimeTest” (you can modify the constants at the runtime for testing and debug builds, but you won’t be able to do that in optimized non-test builds).
  • The consumer of a constant does not need to worry about the fact that the value might come from a config file. In other words we get “configurable” constants for free. Some might argue that constants should not be editable, but in my experience this becomes useful for testing/debugging. Playing with timeout/retry values could help expose deadlocks or leaks. A leak that takes 1 day to show up with an interval of 2 minutes could appear after 15 minutes with a 1 second interval.
  • It becomes very easy to get an overview of the constants being used.
How does it work?
All the work is done by the ConstGen.xslt XML conversion: it has been implemented and can be reused by all projects: get it from here.

As the entry argument this XSLT file has an XML project definition file. In difference to the XSLT file above which can be reused for all the projects, each project must define an XML constants definition file. Here is an example:

<?xml version="1.0" encoding="utf-8" ?>
<constants>
  <constant namespace="MyCompany.MyProject" managedName="Constants" access="internal" isStatic="yes">
    <description>All constants for Server.</description>
    <fields>
      <!-- Constants -->
      <field name="VERSION" type="string" value='1.0' mode="runtimeTest" configName="Version" access="internal">
        <description>Current Server version.</description>
      </field>
      <field name="TRANSFER_TIMEOUT" type="int" value="30 * 1000" mode="runtimeTest" configName="Transfer" access="internal">
        <description>Time in millisecods for how long the transfer operation can take.</description>
      </field>
    </fields>
  </constant>

  <constant namespace="MyCompany.MyProject.Common" managedName="Constants" isStatic="yes">
    <description>Common constants for the Call Park Server and Music on Hold functionality.</description>
    <fields>
      <field name="NUMBER_THREADS" type="int" value="Environment.ProcessorCount" mode="runtime" configName="NumberThreads" access="internal">
        <description>Number of running threads in parallel</description>
      </field>
      <field name="RETRY_INTERVAL" type="int" value="3 * 60 * 1000" mode="const" access="internal">
        <description>The retry interval for Enpoint.Establish() in milliseconds.</description>
      </field>
      <field name="RETRY_MAX_ATTEMPTS" type="int" value="10" mode="const" access="private">
        <description>Maximum number of retries to reserve an orbit.</description>
      </field>
    </fields>
  </constant>
</constants>
     
    Next you run XSLT giving as an entry argument the constants definition XML file. As a result a real C# constants file is created. For the XML file above it will look like this.
   
    Now, how do we make this preprocessing step of conversion of an XML file to C# happen? The easiest is to use Visual Studio: it has a built-in XSLT support, so just click on Execute button on opened XSLT file (the one above), it will ask you for an entry XML file, and that's it.
But of course you will want to automate this step. Then, what you need to do is to download MSXSL tool and run it before compiling your project.

If you want to change a value of a constant at runtime, change the corresponding value of the MyApp.exe.config file. Then restart the service. This file should be placed under the same directory where your executable is. Here is an XSLT file that you can use to create the configuration file where you can change the constant values. I.e. "constant" values now will be read from a configuration file. They still will be constants since they will remain same through the execution of the program and you need to restart the program if you change the "constant" value in the configuration file.

The configuration file created by running the XSLT file above and the same XML file will be the following:

<?xml version="1.0" encoding="UTF-8" ?>
<configuration>
  <appSettings>

    <add key="Version" value="1.0"/>
    <add key="Transfer" value="30 * 1000"/>
    <add key="NumberThreads" value="Environment.ProcessorCount"/>

  </appSettings>
</configuration>

Tuesday, November 23, 2010

Moscow, MS Platforma 2011

I just returned from MS Platforma Conference in Moscow.

It is the biggest IT event in Russia - to popularize Microsoft Software, solutions, and talk about what's new among Russian and a few former "CCCP" IT specialists.

This time it coincided with Microsoft Lync launch - I am also a developer for Lync at Microsoft Zurich Development Center. I participated in two cojoint talks about Lync: gave a demo on Lync Architecture session and talked about dial-in conferencing in another session.

The conference was greatly organized and was an excellent experience.
Hope to be invited to go there next year as well:)

Here are some pics from the sessions:





There is a good blog by Sergey Trofimov, who participated in this event about this conference.

Monday, November 8, 2010

Uploading Files to Google App Engine

In previous post we talked about sending emails with attachments from the Google App Engine.
But how do we get files from a client to the cloud?

Here is a sample code in C#:

public static string UploadFilesToRemoteUrl(string url, string[] files, NameValueCollection nvc, string cookie)
{
    string boundary = "----------------------------" +
    DateTime.Now.Ticks.ToString("x");

    HttpWebRequest httpWebRequest = (HttpWebRequest)WebRequest.Create(url);
    httpWebRequest.ContentType = "multipart/form-data; boundary=" + boundary;
    httpWebRequest.Method = "POST";
    httpWebRequest.KeepAlive = true;
    httpWebRequest.Headers["Cookie"] = cookie;
    httpWebRequest.Credentials = System.Net.CredentialCache.DefaultCredentials;

    Stream memStream = new MemoryStream();

    byte[] boundarybytes = Encoding.ASCII.GetBytes("\r\n--" + boundary + "\r\n");

    string formdataTemplate = "\r\n--" + boundary +
        "\r\nContent-Disposition: form-data; name=\"{0}\";\r\n\r\n{1}";

    foreach (string key in nvc.Keys)
    {
        string formitem = string.Format(formdataTemplate, key, nvc[key]);
        byte[] formitembytes = System.Text.Encoding.UTF8.GetBytes(formitem);
        memStream.Write(formitembytes, 0, formitembytes.Length);
    }

    memStream.Write(boundarybytes, 0, boundarybytes.Length);
    string headerTemplate = "Content-Disposition: form-data; name=\"{0}\";filename=\"{1}\"\r\n\r\n";

    for (int i = 0; i < files.Length; i++)
    {
        string header = string.Format(headerTemplate, "file" + i, files[i]);
        byte[] headerbytes = Encoding.UTF8.GetBytes(header);

        memStream.Write(headerbytes, 0, headerbytes.Length);

        FileStream fileStream = new FileStream(files[i],
            FileMode.Open, FileAccess.Read);
        byte[] buffer = new byte[1024];

        int bytesRead = 0;

        while ((bytesRead = fileStream.Read(buffer, 0, buffer.Length)) != 0)
        {
            memStream.Write(buffer, 0, bytesRead);

        }
        memStream.Write(boundarybytes, 0, boundarybytes.Length);
        fileStream.Close();
    }

    httpWebRequest.ContentLength = memStream.Length;

    Stream requestStream = httpWebRequest.GetRequestStream();

    memStream.Position = 0;
    byte[] tempBuffer = new byte[memStream.Length];
    memStream.Read(tempBuffer, 0, tempBuffer.Length);
    memStream.Close();
    requestStream.Write(tempBuffer, 0, tempBuffer.Length);
    requestStream.Close();

    WebResponse webResponse = httpWebRequest.GetResponse();

    Stream stream = webResponse.GetResponseStream();
    StreamReader reader = new StreamReader(stream);

    string responseFromServer = reader.ReadToEnd();

    webResponse.Close();
    reader.Close();
    stream.Close();
    httpWebRequest = null;
    webResponse = null;

    return responseFromServer;
}

The method above has a cookie as a parameter. Here is an excellent site by Dale Lane where you can see how to access authenticated Google App Engine service from a C# client.

Sunday, November 7, 2010

How to Send Email with Attachments Using Google App Engine

The cool thing about Google App Engine is that you can send e-mails from there for free (well, up to 2000 per day). You can even send attachments with it!

How do you do that? Here is a code snippet. Note that two cases are dealt here, one when there are attachments (so a multi part message is created) and the simplest case where there no attachments:


private static String SendEmail(String user, String to,
    String cc, String subject, String body, String filename,
    String mimeType, byte[] rawData)
{
    Properties props = new Properties();
    Session session = Session.getDefaultInstance(props, null);

    MimeMessage message = new MimeMessage(session);

    try {
        message.setFrom(new InternetAddress(user));
        message.addRecipient(Message.RecipientType.TO,
                             new InternetAddress(to));
        if (cc != null && !cc.trim().isEmpty()) {
              message.addRecipient(Message.RecipientType.CC,
                             new InternetAddress(cc));                    
        }
        message.setSubject(subject, "UTF-8");

        if (rawData != null && filename != null &&
           !filename.isEmpty()) {
             if (mimeType == null || mimeType.isEmpty()) {
                 mimeType = GetMimeType(filename);
             }

             Multipart mp = new MimeMultipart();
             MimeBodyPart htmlPart = new MimeBodyPart();

             htmlPart.setContent(body, "text/html");
             mp.addBodyPart(htmlPart);
             ByteArrayDataSource dataSource = new
                 ByteArrayDataSource(rawData, mimeType);
     
             MimeBodyPart attachment = new MimeBodyPart();
             attachment.setFileName(filename);
             attachment.setDataHandler(
                 new DataHandler(dataSource));                                  
             mp.addBodyPart(attachment);

             message.setContent(mp);
         }
         else {
             message.setText(body, "UTF-8");
         }

         Transport.send(message);
     } catch (Exception e) {
          return "Error: " + e.getMessage();
     } 
     return "ok";
}

Wednesday, November 3, 2010

How to write a client to access Google App Engine with GWT

This tip is needed when you create a new Google App Engine project using Eclipse and choose the default settings, in particular when you leave Google Web Toolkit box checked (no trick needed if you uncheck it). If you always uncheck Google Web Toolkit you won't need this. You may still find useful the below example on how to access Google App Engine from a C# client.

The problem is that the framework will create for you all the skeletons for accessing the App Engine from the browser. But what if in addition you want to access the App Engine from a custom made client?

Passing regular parameters will not help - they won't be parsed correctly by the Google App Engine framework.
You need a custom parameter passed, that looks like this:

"5|0|6|http://1.latest.myapp.appspot.com/myapp/|SOMEGUID|myapp.clientGreetingService|greetServer|java.lang.String/2004016611|ThisIsAParameterValue|1|2|3|4|1|5|6|";


Here is the client in C# accessing Google App Engine (created from Eclipse with Web Toolkit option): 

WebRequest request = WebRequest.Create(   "http://1.latest.myapp.appspot.com/appname/greet");
request.Credentials = CredentialCache.DefaultCredentials;
request.Method = "POST";
request.ContentType = "text/x-gwt-rpc; charset=utf-8";

string content = "5|0|6|http://1.latest.myapp.appspot.com/myapp/|AF76B9CA2759BDA4DBA207D5542054B2|myapp.clientGreetingService|greetServer|java.lang.String/2004016611|ThisIsAParameterValue|1|2|3|4|1|5|6|";

byte[] contentBytes = System.Text.UTF8Encoding.UTF8.GetBytes(content);
request.ContentLength = contentBytes.Length;
using (Stream stream = request.GetRequestStream())
{
    stream.Write(contentBytes, 0, contentBytes.Length);
}

WebResponse response = request.GetResponse();
Stream dataStream = response.GetResponseStream();
StreamReader reader = new StreamReader(dataStream);
string responseFromServer = reader.ReadToEnd();
Console.WriteLine(responseFromServer);

reader.Close();
response.Close(); 

In a nutshell, what is important is to provide the servlet name (greet server as in the example above) and then parameter types and values. Other values are not important, the Google App Engine parser just needs the number and placement of pipes.