Mar
7
2008

Indexing and searching business entities using Lucene.Net Framework, part 3

Designing a search engine, using generics and reflection, that indexes and searches the content of your business entities without being intrusive.

Parts 1 and 2 are available at the following links:

  1. Indexing and searching business entities using Lucene.Net Framework, part 1
  2. Indexing and searching business entities using Lucene.Net Framework, part 2

Solution’s architecture

The main idea is to be able to declare which properties of a business entity must be indexed when that entity is saved or updated in the chosen persistence system.

To be as unintrusive as possible in our model, we quickly came to the idea of extending our business entities with metadata. The issue is then that, at runtime, the framework needs to know which metadata to look for on the entity in order to index the content of the decorated property.

Since one of the goals is a framework that manages the indexing and searching of any business entity, we could have written a simple class inheriting from System.Attribute in an assembly separate from our domain. That would have the drawback of being too intrusive in our domain, because the domain would then have to reference that assembly. Another solution was needed.

As we have seen, the framework needs to know the metadata so that it can index the content of the property at runtime. This means that at development time it is entirely possible to generalize this information using the generics of the .NET Framework 2.0. As we are talking about metadata, the only constraint is that our class inherits from System.Attribute.

The choice was then made to define a utility class in the domain assembly, inheriting from System.Attribute, which serves as a decorator for the entity properties that need to be indexed.
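To make this more concrete, here is a minimal sketch of what such a marker attribute and a decorated entity could look like. Only the names SearchableAttribute, Post and innoveo.Blog.Domain come from this post; the empty attribute body and the Title, Body and PublishedOn properties are assumptions made for illustration.

```csharp
using System;

namespace innoveo.Blog.Domain
{
    // Marker attribute defined inside the domain assembly itself, so the
    // domain does not reference any indexing assembly.
    // Assumption: the real attribute may carry extra members.
    [AttributeUsage(AttributeTargets.Property, AllowMultiple = false, Inherited = true)]
    public sealed class SearchableAttribute : Attribute
    {
    }

    // Illustrative entity; the property names are hypothetical.
    public class Post
    {
        [Searchable]
        public string Title { get; set; }

        [Searchable]
        public string Body { get; set; }

        // Not decorated, so an indexer looking for SearchableAttribute
        // would skip it.
        public DateTime PublishedOn { get; set; }
    }
}
```

The entity stays a plain class: it inherits from nothing technical and only carries metadata defined in its own assembly.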

In the following picture you can see an example of the domain of an application to which we have added our SearchableAttribute attribute, used to decorate the Post and Page classes:

The Visual Studio solution is organized as a Domain Driven Design solution:

We have thus defined the new SearchableAttribute attribute in the innoveo.Blog.Domain assembly.

Here is how our solution is organized:

  • innoveo.Blog.DAL: Data access layer using the Euss O/R mapping tool
  • innoveo.Blog.Domain: Assembly containing our domain business entities
  • innoveo.Blog.Services: Layer exposing the different business services
  • innoveo.Blog.Web: Web presentation & web services layer
  • Blog: The web application

So much for the solution that will use our business entity indexing framework. Let’s now have a closer look at the framework itself!

Indexing Framework

First, here is the class diagram:

The role of each class in our framework is as follows:

  • EntityIndexer manages an index and indexes the business entities
  • EntitySearcher lets you search business entities
  • EntityDocument is used by the EntityIndexer class to manage Lucene.Net Document instances
  • IndexPath is a utility class used to specify the location of the index

As you can see in the diagram, we use .NET Framework 2.0 generics so that the framework can look for any attribute decorating our business entities, and so that it does not depend on any particular entity. This gives a lot of flexibility in use, since it lets you index any property of type string of any business entity, all without being intrusive in our model.
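The combination of generics and reflection can be sketched as follows. This is not the EntityIndexer from the class diagram, only a hypothetical helper (SearchableReader, Extract) showing how a generic type parameter can stand in for the marker attribute while reflection collects the decorated string properties. The SearchableAttribute and Page declarations below are stand-ins so that the sketch is self-contained.

```csharp
using System;
using System.Collections.Generic;
using System.Reflection;

// Hypothetical helper: TAttribute is whatever marker attribute the
// domain assembly defines, so this code depends on no entity type.
public static class SearchableReader<TAttribute> where TAttribute : Attribute
{
    // Returns property name -> value for every public, readable string
    // property of the entity that is decorated with TAttribute.
    public static IDictionary<string, string> Extract(object entity)
    {
        if (entity == null) throw new ArgumentNullException("entity");

        Dictionary<string, string> fields = new Dictionary<string, string>();
        PropertyInfo[] properties =
            entity.GetType().GetProperties(BindingFlags.Public | BindingFlags.Instance);

        foreach (PropertyInfo property in properties)
        {
            // Only plain string properties are considered.
            if (property.PropertyType != typeof(string) || !property.CanRead) continue;
            if (property.GetIndexParameters().Length > 0) continue;
            // Skip properties that do not carry the marker attribute.
            if (!Attribute.IsDefined(property, typeof(TAttribute), true)) continue;

            string value = (string)property.GetValue(entity, null);
            if (value != null) fields[property.Name] = value;
        }
        return fields;
    }
}

// Stand-ins for a domain attribute and entity (assumed names).
[AttributeUsage(AttributeTargets.Property)]
public sealed class SearchableAttribute : Attribute { }

public class Page
{
    [Searchable] public string Title { get; set; }
    [Searchable] public string Content { get; set; }
    public string Slug { get; set; }   // not decorated, so never extracted
}
```

Calling SearchableReader<SearchableAttribute>.Extract(page) on a Page with Title and Content set returns entries for those two properties and none for Slug. Each name/value pair is the kind of data an indexer could then turn into a field of a Lucene.Net document.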

Now that we know the architecture of our framework, it is time to look deeper into the details of the implementation.

This post is cross-posted on innoveo blog and in French on my .NET community portal Tech Head Brothers.

Mar
7
2008

Indexing and searching business entities using Lucene.Net Framework, part 2

Designing a search engine, using generics and reflection, that indexes and searches the content of your business entities without being intrusive.

Part 1 is available at this link: Indexing and searching business entities using Lucene.Net Framework, part 1

Lucene.Net presentation

Lucene.Net is an open source project coming from the Java world, currently incubating at the Apache Software Foundation (ASF). It is a source code port to the .NET platform, written in C# and done class by class and API by API, of the indexing and searching engine of Java Lucene.

Apache Lucene is an efficient indexing and searching engine for text data. However, it offers no built-in support for documents such as Office Word or PDF files; you need extensions that extract the text content of a document before it can be indexed. The same is true for markup documents like HTML.

Lucene.Net scrupulously follows the APIs defined in the classes of the original Java version of Lucene. API and class names are preserved, adapted only to follow the naming guidelines of the C# language. For example, the method Hits.length() of the Java implementation is written Hits.Length() in its C# version.
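As an illustration of this naming convention, here is a small sketch that indexes one text in memory and counts the hits for a query. It assumes the 2.0-era Lucene.Net API (Hits, Field.Index.TOKENIZED); later releases changed or removed some of these members. The class name LuceneNamingDemo, the method CountMatches and the field name "content" are hypothetical.

```csharp
using Lucene.Net.Analysis.Standard;
using Lucene.Net.Documents;
using Lucene.Net.Index;
using Lucene.Net.QueryParsers;
using Lucene.Net.Search;
using Lucene.Net.Store;

public static class LuceneNamingDemo
{
    // Indexes a single text in an in-memory index and returns how many
    // documents match the query. Assumes the Lucene.Net 2.0 API.
    public static int CountMatches(string text, string queryText)
    {
        RAMDirectory directory = new RAMDirectory();

        // Indexing side: the Java addDocument() and close() become
        // AddDocument() and Close().
        IndexWriter writer = new IndexWriter(directory, new StandardAnalyzer(), true);
        Document document = new Document();
        document.Add(new Field("content", text, Field.Store.YES, Field.Index.TOKENIZED));
        writer.AddDocument(document);
        writer.Close();

        // Searching side: the Java hits.length() becomes hits.Length().
        IndexSearcher searcher = new IndexSearcher(directory);
        QueryParser parser = new QueryParser("content", new StandardAnalyzer());
        Query query = parser.Parse(queryText);
        Hits hits = searcher.Search(query);
        int count = hits.Length();
        searcher.Close();

        return count;
    }
}
```

Apart from the casing of the method names, the code reads almost line for line like its Java equivalent.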

Like the APIs and the classes, the algorithms of the Java version of Lucene are also ported to the C# version. This means that an index created using the Java version of Lucene is 100% compatible with its C# version, for reading, writing and updating. Therefore two processes, one written in Java and the other in C#, could run concurrent searches on the same index.

You can consult the documentation of the latest stable version, version 2.0, on the following page. To download the latest stable version, browse to this page. For more information about Lucene, I recommend the pages dedicated to the Java version of Lucene, which are much more substantial.

Lucene.Net Architecture

The lowest layer is the storage layer (Storage). The layer above it provides access to the index files (data access). This layer is used by both the indexing system and the searching system. On top of those we find a search layer and a query parser layer used by the searching part of Lucene.Net. Likewise, we find a parser layer and a document layer used by the indexing part of Lucene.Net.

For more information about Lucene, I recommend reading the presentation on the Lucene website.

Now that we have a better view of what Lucene.Net is about, we will see in the next part how to use it to index the properties of our business entities.

This post is cross-posted on innoveo blog and in French on my .NET community portal Tech Head Brothers.

Dec
2
2007

Setting correctly your configuration for Web Deployment Projects VS08

I read on the MSDN Forums that it is not possible to use configurations other than Release and Debug. This is wrong.

I am using another one for the Tech Head Brothers portal; what you have to take care of is how you set up your configuration. In the following picture you can see how I defined a new Staging configuration and set the Web Deployment Project WebApplication.csproj_deploy to build using this configuration.

Then you are able to use it this way in your MSBuild file:

<PropertyGroup Condition=" '$(Configuration)|$(Platform)' == 'Staging|AnyCPU' ">
    <DebugSymbols>true</DebugSymbols>
    <OutputPath>.\Staging</OutputPath>
    <EnableUpdateable>true</EnableUpdateable>
    <UseMerge>true</UseMerge>
    <SingleAssemblyName>THB.Portal</SingleAssemblyName>
</PropertyGroup>

Dec
2
2007

Web Deployment Projects for VS08 released as CTP & Migration tips

My old build script stopped working after the migration to this new version; in this post you will find the adaptations I had to make.

Add the missing ToolsVersion attribute as follows:

<Project DefaultTargets="Build" 
         xmlns="http://schemas.microsoft.com/developer/msbuild/2003" 
         ToolsVersion="3.5">

Replace

<Import Project="$(MSBuildExtensionsPath)\Microsoft\WebDeployment\v8.0\Microsoft.WebDeployment.targets"/>

with

<Import Project="$(MSBuildExtensionsPath)\Microsoft\WebDeployment\v9.0\Microsoft.WebDeployment.targets"/>

I also had to add the following to the beginning of my AfterBuild target:

<Target Name="AfterBuild">
  <CreateItem Include="$(TempBuildDir)\**\*.*">
    <Output ItemName="CompiledFiles" TaskParameter="Include" />     
  </CreateItem>
  <Exec Command="if exist &quot;$(WDTargetDir)&quot; rd /s /q &quot;$(WDTargetDir)&quot;" />
  <Exec Command="if not exist &quot;$(WDTargetDir)&quot; md &quot;$(WDTargetDir)&quot;" />
  <Copy SourceFiles="@(CompiledFiles)" DestinationFolder="$(WDTargetDir)\%(CompiledFiles.SubFolder)%(CompiledFiles.RecursiveDir)" />
  <Exec Command="if exist &quot;$(TempBuildDir)&quot; rd /s /q &quot;$(TempBuildDir)&quot;" />
  ...
</Target>

Otherwise I ended up with a TempBuildDir folder containing part of my solution. This is just a copy of what is defined in Microsoft.WebDeployment.targets.

Now everything works as before. I can:

  1. Set Staging as my active solution configuration
  2. Run a compilation
  3. Zip the merged result of the build
  4. Start uploading using SSH to the target stage server

Nice!

Nov
16
2007

Indexing and searching business entities using Lucene.Net Framework, part 1

Designing a search engine, using generics and reflection, that indexes and searches the content of your business entities without being intrusive.

Introduction

Today, one feature that almost all web sites implement is a way to index content and give users the possibility to search that content, which is spread across their web pages. It is one of the simplest ways to improve the user experience on your web site.

Blogs brought categories and tags, which make it possible to label information. However, this useful method isn’t always sufficient. It is then advisable to use a real content indexing method.

In this set of posts I propose to take a look at the indexing and searching method I implemented on the web site of innoveo solutions, my new company. I also hope to bring this system soon to my web site Tech Head Brothers.

Both web sites, innoveo solutions and Tech Head Brothers, were developed using Domain Driven Design. So, we started by defining a domain model with our business entities. In this layer we do not concentrate on technical aspects such as persistence. Instead, we concentrate on the domain we want to address.

One of the main ideas is to avoid intruding on the domain model, either by inheriting from technical classes or by linking this layer to any technical framework.

To achieve this goal we will use an O/R mapping tool (Euss) for the business entities persistence as well as the Lucene.Net framework for the indexing part.

After quite a few discussions (thanks Didier ;) in which we asked ourselves whether we would be better off using a service offered by one of the big search players on the Internet, we finally decided to keep control of our search tool.

Wanting to be independent of any database and of services like Full-Text indexing or Indexing Services, we decided to use Lucene.Net to avoid having to re-implement everything from scratch.

In the following posts, I will give an introduction to Lucene.Net, then present the architecture I have chosen for the indexing and searching framework, the implementation details of that framework and, finally, an example of integration into a data access layer.

This post is cross-posted on innoveo blog and in French on my .NET community portal Tech Head Brothers.

Aug
19
2007

Yahoo release a web site evaluation tool called YSlow

YSlow is a new web tool published by Yahoo. It lets you test a total of 13 rules against your web site to check whether it is efficient.

YSlow is a Firefox add-on integrated with the Firebug web development tool.

You can read the rules that are checked on this page: Thirteen Simple Rules for Speeding Up Your Web Site, and the documentation is on this page: YSlow User Guide.

There is also a screencast on this page: YSlow Podcast Interview and Screencast Demo.

Performance view

YSlow analyzes the web page and generates a global mark and one mark per rule tested.

This is the YSlow GUI showing the result of the analysis of the Tech Head Brothers main page. You can see the global mark of C (not so bad ;) and the different marks per rule, for example a D for the rule "Make fewer HTTP requests". You can open up each section that didn't get an A to see what might be improved.

We can see in the previous picture that YSlow identified that we are making 11 requests for different JavaScript files. When you click on the rule you get a new web page with the explanation of the rule.

Stats view

YSlow computes the total size of a web page with and without the use of the cache, and also gives you some information on cookies.

You can clearly see the impact of the cache on your web application. In the previous screenshot you can see that the browser needs to download 127.3 Kb with 21 HTTP requests without the cache, and that these numbers drop to 28.2 Kb and 5 HTTP requests with it.

Components view

You also get a complete list of all the components of your page, including their type, URL, expiration date, gzip status, loading time, size and ETag.

Menu tools

The last menu lets you display all the JavaScript and CSS used by your web page to get a global view. It also gives you a web page with the results of the test on a single page. The last part is JSLint, which checks your code against the Code Conventions for the JavaScript Programming Language.

YSlow is a tool that might be useful when you come to optimize your web site.

Aug
4
2007

ASP.NET AJAX and URL rewriting issue

If you are using URL rewriting you might know that you have to take care with the way you reference resources, as written in Scott Guthrie's post, Tip/Trick: Url Rewriting with ASP.NET:

Handling CSS and Image Reference Correctly

One gotcha that people sometime run into when using Url Rewriting for the very first time is that they find that their image and CSS stylesheet references sometimes seem to stop working.  This is because they have relative references to these files within their HTML pages - and when you start to re-write URLs within an application you need to be aware that the browser will often be requesting files in different logical hierarchy levels than what is really stored on the server.

For example, if our /products.aspx page above had a relative reference to "logo.jpg" in the .aspx page, but was requested via the /products/books.aspx url, then the browser will send a request for /products/logo.jpg instead of /logo.jpg when it renders the page.  To reference this file correctly, make sure you root qualify CSS and Image references ("/style.css" instead of "style.css").  For ASP.NET controls, you can also use the ~ syntax to reference files from the root of the application (for example: <asp:image imageurl="~/images/logo.jpg" runat="server"/>

This is of course also the case for JavaScript.

I am using the Request.PathInfo approach described in Scott's post to rewrite one URL on Tech Head Brothers. Everything works fine except that Sys.Services.AuthenticationService gets confused by the URL rewriting and tries to post back to:

http://localhost:8080/Auteurs.aspx/laurent-kempe/Authentication_JSON_AppService.axd/Login

when I expect:

http://localhost:8080/Authentication_JSON_AppService.axd/Login

Looking at the page rendered by ASP.NET I see that the following is rendered:

<script type="text/javascript">
<!--
Sys.Services._AuthenticationService.DefaultWebServicePath = 'Authentication_JSON_AppService.axd';
// -->
</script>

So I am clearly missing a / in the path, and because of that the URL rewriting sends the post to the wrong address on the server.

The first solution was found by Cyril Durand (always a great help in this AJAX world ;) and is to add this line of code:

ScriptManager.GetCurrent(Page).AuthenticationService.Path = "/Authentication_JSON_AppService.axd";

But I did it a bit differently, directly in the JavaScript, by adding the following line:

Sys.Services.AuthenticationService.set_path('/Authentication_JSON_AppService.axd');

By the way, this JavaScript line is what Cyril's solution would generate at rendering time.

Thanks Cyril for the always nice talks.

Jul
7
2007

Starting ASP.NET Development Server from a right click in explorer

Update: Following the comment of Jon Galloway

I know that there are other solutions doing this but I was just asked about how to do it today (hey Christine ;) and had this registry file stored somewhere for a while waiting for a blog post.

Save this to a file with an extension .reg, e.g. "asp.net web server here.reg":

Windows Registry Editor Version 5.00

[HKEY_LOCAL_MACHINE\SOFTWARE\Classes\Folder\shell\VS2005 WebServer]
@="ASP.NET 2.0 Web Server Here"

[HKEY_LOCAL_MACHINE\SOFTWARE\Classes\Folder\shell\VS2005 WebServer\command]
@="C:\\Windows\\Microsoft.NET\\Framework\\v2.0.50727\\Webdev.WebServer.exe /port:9081 /path:\"%1\""

You can also change the port used to start the ASP.NET Development Server. If the /port option is not specified, port 80 is used.

Then double click on the .reg file to save the settings in the registry.

Now you will have access to the following right click menu in explorer and you can browse your site without running Visual Studio:

Update: This is not needed anymore

You will need to copy hstart.exe to your C:\Windows\system32\ first, or change the path used in the .reg file. This little tool (3 KB), Hidden Start by NTWind Software, is a really good "small startup manager that allows console applications to be started without any windows in the background." Exactly what is needed in our case.

Jun
27
2007

Migration from WSE 3 to WCF

I started to migrate the Tech Head Brothers authoring tool and portal from Web Services Enhancements 3 (WSE 3) to Windows Communication Foundation (WCF). This is the next step in the integration of .NET Framework 3.0 in the Tech Head Brothers portal.

Until today I was using WSE 3 from the Word VSTO solution to securely publish content to the portal directly out of Word 2003/2007.

The migration was straightforward thanks to my initial implementation, which already used interfaces and implementation classes. So basically I had to:

  • Remove the reference to WSE 3 and add one to System.ServiceModel
  • Change the attributes on the interface to ServiceContract and OperationContract
  • Update the web.config to configure the new WCF endpoint, binding...
  • Regenerate the client proxy and slightly update the client code

This first step took me around 1h30 and worked very well, but it was still missing all the security of the old version.

Then, to implement the security part of the web services, I:

  • Created a new certificate
  • Removed the Policy using the self-developed aspnetUsernameTokenSecurity, which is not needed anymore
  • Configured the web.config with a new WCF service behavior using userNameAuthentication and serviceAuthorization linked to ASP.NET providers
  • Replaced the code checking the role of the user calling the service with a PrincipalPermission attribute
  • Regenerated the client proxy and reconfigured it

So now I have security at the message level, with messages encrypted using the certificate, and at the web service level, with a role access check.
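The contract attributes and the declarative role check described above might look like the following sketch. The ServiceContract, OperationContract and PrincipalPermission attributes are the real WCF and .NET ones; the IPublishingService contract, its Publish operation and the "Author" role are hypothetical, since the actual service interface is not shown here.

```csharp
using System.Security.Permissions;
using System.ServiceModel;

// Hypothetical contract: only the attribute usage reflects the steps above.
[ServiceContract]
public interface IPublishingService
{
    [OperationContract]
    string Publish(string title, string content);
}

public class PublishingService : IPublishingService
{
    // Declarative role check replacing hand-written role-checking code.
    // "Author" is an assumed role name; the demand fails with a security
    // exception when the caller's principal is not in that role.
    [PrincipalPermission(SecurityAction.Demand, Role = "Author")]
    public string Publish(string title, string content)
    {
        // Placeholder body; the real service would store the content.
        return "Published: " + title;
    }
}
```

For the role demand to see the ASP.NET roles, the service authorization behavior in web.config has to be linked to the ASP.NET role provider, as mentioned in the list above.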

Currently the solution uses attributes to check the role of the user who accesses the web service. I don't find this solution flexible enough, and the next step will be to move the configuration into a configuration file, which would let me change access rights without changing a single line of code.

As always, once this works I will publish the source code of the VSTO client authoring tool (Tech Head Brothers Authoring) on the CodePlex project THBAuthoring.

May
5
2007

CSS Adapters issues with Login controls

If you are using the ASP.NET 2.0 CSS Friendly Control Adapters 1.0 with the ASP.NET 2.0 Login controls, you might have experienced issues such as multiple postbacks when using Internet Explorer. You can get a GREAT fix and an explanation of the issue from Tana Isaac of Wellington, New Zealand.

Double Postback Problem - Cause (skip this if you just want the fix!):

Buttons that reside within the controls that are adapted by the CSS Control Adapters, for example the CreateUserWizard.CreateUserButton, are rendered out differently depending on the button type (which is set for example via CreateUserWizard.CreateUserButtonType = ButtonType.Link). The default button type used by the membership controls is Button. The following html controls are rendered out for the different System.Web.UI.WebControls.ButtonType enum values:

ButtonType.Button: input, type=submit
ButtonType.Image: input, type=image
ButtonType.Link: anchor

Both of the input controls will automatically cause the form that they reside within to be posted back to the server when they are clicked, whereas the anchor will not - instead it needs some javascript to cause a postback. This is where the problem is - all three html controls are rendered out with javascript attached to post the form back to the server on a click event, which allows buttons of type 'Link' to work correctly but causes buttons of type 'Button' and 'Image' to postback twice - the first time due to the javascript and the second because of the native postback.

The javascript method used to cause the postback is as follows:

WebForm_DoPostBackWithOptions(WebForm_PostBackOptions(eventTarget, eventArgument, validation, validationGroup, actionUrl, trackFocus, clientSubmit))

In order to stop 'Button' buttons and 'Image' buttons firing twice we just need to set the clientSubmit parameter to false when these types of buttons are rendered out.

Other problems (specific to the CreateUserWizardAdapter control)

Once the double postback problem was fixed two other problems popped up. The first was that users still weren't being created. This was because the id and name (which is derived from the id) being used for the create user button was missing an underscore.

The other problem was that the cancel button didn't work. It was also missing an underscore from its name and also wasn't registered for Event Validation.

Read more...

Great work Tana, thanks for sharing; you saved me some time.

Update: The CSS Adapters project is now hosted on Codeplex, http://www.codeplex.com/cssfriendly.

About Laurent

Laurent Kempé

Laurent Kempé is the editor, founder, and primary contributor of Tech Head Brothers, a French portal about Microsoft .NET technologies.

He has been employed by Innoveo Solutions since 10/2007 as a Senior Solution Architect and certified Scrum Master.

Founder, owner and Managing Partner of Jobping, which provides a unique and efficient platform for connecting Microsoft skilled job seekers with employers using Microsoft technologies.

Laurent was awarded Most Valuable Professional (MVP) by Microsoft from April 2002 to April 2012.

JetBrains Academy Member
Certified ScrumMaster