All posts by Jim

Software engineer from Crete living in Switzerland; C# & Azure paladin; economics hobbyist; firearm enthusiast; perpetually tormented by 3 beautiful women :-)

Please don’t write logs inside Program Files (here’s how to do it right)

So the other day I’m troubleshooting a Windows Service that keeps failing on a server, part of a product we’re using in the company. Long story short, that’s what the problem was:

Access to the path 'C:\Program Files\whatever\whatever.log is denied'

I mean, dear programmer, look. You want to write your application’s logs as simple text files. I get it. Text files are simple, reliable (if the file system doesn’t work, you have bigger problems than logging) and they’re shown in virtually every coding tutorial in every programming language. Depending on the case, there might be better ways to do that such as syslog, eventlog and others.

But sure, let’s go with text files. Take the following example somewhere in the middle of a Python tutorial. Look at line 3:

import logging

logging.basicConfig(filename='app.log', filemode='w', format='%(name)s - %(levelname)s - %(message)s')
logging.warning('This will get logged to a file')

Did you notice? This code writes the log in the same place as the binary. It’s not explicitly mentioned and usually you wouldn’t give it a second thought, right?

To be clear, I don’t want to be hard on the writers of this or any other tutorial; it’s just a basic tutorial, and as such it should highlight the core concept. A professional developer writing an enterprise product should know a bit more!

But the thing is, these examples are everywhere. Take another Java tutorial and look at line 16:

package com.javacodegeeks.snippets.core;

import java.util.logging.Logger;
import java.util.logging.FileHandler;
import java.util.logging.SimpleFormatter;
import java.io.IOException;

public class SequencedLogFile {

    public static final int FILE_SIZE = 1024;
    public static void main(String[] args) {

        Logger logger = Logger.getLogger(SequencedLogFile.class.getName());
        try {
            // Create an instance of FileHandler with 5 logging files sequences.
            FileHandler handler = new FileHandler("sample.log", FILE_SIZE, 5, true);
            handler.setFormatter(new SimpleFormatter());
            logger.addHandler(handler);
            logger.setUseParentHandlers(false);
        } catch (IOException e) {
            logger.warning("Failed to initialize logger handler.");
        }
        logger.info("Logging info message.");
        logger.warning("Logging warn message.");
    }
}

Or this Dot Net tutorial, which explains how to set up Log4Net (which is great!) and gives this configuration example. Let’s see if you can spot this one. Which line is the problem?

<log4net>
  <root>
    <level value="ALL" />
    <appender-ref ref="LogFileAppender" />
  </root>
  <appender name="LogFileAppender" type="log4net.Appender.RollingFileAppender">
    <file value="proper.log" />
    <lockingModel type="log4net.Appender.FileAppender+MinimalLock" />
    <appendToFile value="true" />
    <rollingStyle value="Size" />
    <maxSizeRollBackups value="2" />
    <maximumFileSize value="1MB" />
    <staticLogFileName value="true" />
    <layout type="log4net.Layout.PatternLayout">
      <conversionPattern value="%d [%t] %-5p %c %m%n" />
    </layout>
  </appender>
</log4net>

If you answered “7”, congrats, you’re starting to get it. Not using a path -this should be obvious, I know, but it’s easy to forget nevertheless- means writing in the current path, which by default is wherever the binary is.

So this works fine while you’re developing. It works fine when you do your unit tests. It probably works when your testers do the user acceptance testing or whatever QA process you have.

But when your customers install the software, the exe usually goes to C:\Program Files (that’s in Windows; in Linux there are different possibilities as explained here, but let’s say /usr/bin). Normal users do not have permission to write there; an administrator can grant this, but they really really really shouldn’t. You’re not supposed to tamper with the executables! Unless you’re doing some maintenance or an upgrade of course.

So how do you do this correctly?

First of all, it’s a good idea to not reinvent the wheel. There are many, many, MANY libraries to choose from, some of them very mature, like log4net for Dot Net or log4j for Java.

But if you want to keep it very simple, fine. There are basically two ways to do it.

If it’s a UI-based software, that your users will use interactively:

Create a directory under %localappdata% (by default C:\Users\SOMEUSER\AppData\Local) with the brand name of your company and/or product, and write in there.

You can get the localappdata path using the following line in Dot Net:

string localAppDataPath = Environment.GetFolderPath(Environment.SpecialFolder.LocalApplicationData);

Take for example the screen-capturing software called Greenshot. These guys do it right:

If it’s a non-interactive software, like a Windows Service:

You can do the same as above, but instead of Environment.SpecialFolder.LocalApplicationData use Environment.SpecialFolder.CommonApplicationData, which by default is C:\ProgramData. So your logs will be in C:\ProgramData\MyAmazingCompany\myamazingproduct.log.

Or -not recommended, but not as horrible as writing in Program Files- you can create something custom like C:\MyAmazingCompany\logs. I’ll be honest with you, it’s ugly, but it works.

But in any case, be careful to consider your environment. Is your software supposed to run on Windows, Linux, Mac, everything? A good place to start is here, for Dot Net, but the concept is the same in every language.

And, also important, make your logging configurable! The location should be changeable via a config file. Different systems have different requirements. Someone will need the logs somewhere special for their own reasons.

But whatever you do, PLEASE PLEASE PLEASE DON’T WRITE WHERE THE BINARY IS. DON’T WRITE IN C:\PROGRAM FILES. IT. DOES. NOT. WORK.

New version of Zoro: 1.0.2

I just published a new version of my open source C# Zoro(*) library in Github and Nuget.org.

Zoro is a data masking/anonymization utility. It fetches data from a database or a CSV file, and creates a CSV file with masked data.

The new version, 1.0.2, has been converted to DotNet Standard 2.0. The command line utility and the test project have been converted to Dotnet Core 5.0.

There is a known issue, not with the code but with the Nuget package. The description claims, as was intended, that the package contains not only the library but also the exe, which can be used as a standalone command line utility. But due to some wrong path in the Github Action, it doesn’t.

I’ll try to get that fixed in the next weeks. Until then, if you need the exe, please checkout the code and build with Visual Studio or Visual Studio Code.

(*) YES NOW I KNOW THIS IS MISSPELLED AND THE CORRECT SPELLING IS ZORRO, I DIDN’T WHEN I STARTED THE LIBRARY, SORRY!

This is the great ideological battle of our time

It used to be capitalism vs. communism, free markets vs. central planning

When people talk about ideologies and political beliefs, the mind usually goes to the left/right political divide, commonly referred to as the “political spectrum“. On one side -and I’m obviously oversimplifying here for the sake of brevity- are the proponents of ideas like central planning, social welfare, big government and redistributive taxation. Names like Marx, Lenin, Rosa Luxemburg, Olof Palme and many others come to mind. On the other side one can find ideas like free markets (and their invisible hand), the law of supply and demand, laissez-faire, entrepreneurship and, generally speaking, as small a government as possible. People mention names like Adam Smith, Keynes, Hayek, Milton Friedman and at the extreme end, Ayn Rand.

In addition to the above, if you follow a more holistic school of thought like e.g. the political compass, you would note that there’s more to politics than the way the state handles people’s money. The social behaviours that are or are not allowed are at least as important, if not more. On one hand you have progressive, permissive societies, like Denmark or the Netherlands, and on the other conservative, authoritarian ones like China or Russia.

So clearly both the economic and the societal dimensions of politics are important. People have not only hotly debated the differences between different forms of government, they have literally given their lives for it.

Today, it’s… a bit different.

It’s not that these differences do not exist, or have been ironed out. There are still people that believe that “workers should own the means of production” (a Marxist thesis), others that “a man’s ego is the fountainhead of human progress” (an Ayn Rand thesis) and others of course everything in between.

But everyone that takes even a casual look at talk shows, or the social networks, actually any medium that hosts public discourse will note that the lines are increasingly being drawn in a different way: those that base their opinions on reason -or at least try- and those that advance or fall prey to conspiracy theories and other instances of unfounded beliefs.

I’ve first come to this realization some months ago, when I noted that, during discussions in a social network, I kept agreeing with a left-leaning doctor while being more of a free-market person myself (though I’m nowhere near to being an Ayn Rand fan). At the same time, the doctor kept disagreeing with people that belonged to the same party -or at least were clearly left-leaning. Why? Both of us were defending government policies that, no matter how much you liked or hated the current Greek government, were based on the medical research that was available at the time. The “others” were bashing the actions of the government… because they were followers of a different political party. Whether the actions of the government were right or wrong, it simply didn’t matter. For them, the opposing party will always be wrong.

I’ve since tried to follow this a bit more closely, and by now it’s clear to me. You see distinguished members of the US Republican Party distance themselves or even denounce Donald Trump. You have virologists that present solid, peer reviewed evidence on COVID-19, and others that rely on unscientific gobbledygook, even flat-out schoolboy math errors, just so they can advance that “COVID is mostly harmless” or “it’s no worse than the flu” (which it bloody isn’t).

Are there always two sides of the story?

Look, let’s be honest here: the last man described as “universalist”, an all-knowing polymath, was Henri PoincarĆ©, and he died more than 100 years ago. Since then, all of us have to please our trust somewhere. We have to believe that someone, in certain areas, tells the truth -and build up on that.

But reducing the matter to just “I believe A, you believe B, why is your belief better than mine?” misses a very big point. At the very minimum, there are beliefs that are validated by reality and ones that are not. To take a simple example: many of us regularly use GPS to help us drive through our city. If you believe that earth is flat, it’s… a little strange to be able use the signal from satellites, isn’t it?

Seems legit

So the conclusion I’ve drawn from all this is the following: sometimes I find myself agreeing with people that normally don’t share my political opinions. Conversely, I might disagree with people that do. That’s fine. At the end, we share the most important political thesis of all: there exists an objective reality and the best way to discover it is to use the scientific method. And yes, reality imposes constraints on us, often unpleasant, and until we find a way to work around them, we need to accept them.

And no amount of YouTube videos can change that.

How to ask for a certificate the right way: CSR via Windows or Keytool with Subject Alternative Names (SANs)

Sooo you’re working in an enterprise and have to maintain an internal server. The security audit asks you to ensure all HTTP communications are encrypted, so you need to change to HTTPS. Boy is this SO not obvious. You’d think this should be quite easy by now, but there are A LOT of pitfalls in your way.

If you want the TL;DR version, to skip the explanation and go directly to the instructions, scroll directly to the Mandalorian below. No hard feelings, honest 😊

Mistake #1: Use a self-signed certificate

Many, many, MANY tutorials you’ll find online are written with a developer in mind, leaving the maintainer/admin as an afterthought -if that. So what they care about is having some certificate, any certificate, as long as it works on the developer’s PC.

But what this certificate says is basically “I’m Jim because I say so”.

Do I need to say that it won’t work for other PCs? Yes? Well surprise, it won’t.

Mistake #2: Get a certificate from your PC’s certificate authority

I don’t know how some people don’t understand that this, while being a bit more complex, it’s basically the same as #1. What this certificate says is “I’m Jim because someone else who is also Jim says so”.

Yeah, no, it won’t work.

Mistake #3: Get a certificate from a trusted certificate authority using only a server name (or an alias).

Now we’re getting more serious.

Getting a certificate from a trusted certificate authority (CA for short) is the right thing to do. The certificate you get then says “I’m Jim because someone else who you already trust says so”.

So if you get a certificate that verifies you’re, say, server web-ch-zh.xyz123.com or mysite.xyz123.com is good enough. Right?

Ummm…

IT DEPENDS.

If you run a website (e.g. https://www.xyz123.com) and want your HTTPS URL to work without giving a certificate warning that’s fine. You don’t need to do anything else. That’s why most tutorials that avoid the self-signed certificate mine stop here.

But remember, our scenario is that we’re working for an enterprise (a big company) and we’re maintaining an internal server. What that usually -not always, but a lot of the time- means is that communication to our server happens using different hostnames.

Let me give you my own example:

  • I run a service called Joint Information Module or JIM for short -that’s a totally real service name [1].
  • The server name is ch-zh-jim-01.mycompany.local.
  • The users use the web interface of the service by navigating to https://jim.mycompany.com.
  • Another application uses the REST API of the service using the server name (ch-zh-jim-01) without the domain name (mycompany.local).
  • The service uses a queuing software that is installed on the same server. We want to use the same certificate for this as well. The JIM service accesses the queues via https://localhost (and a port number).

Now, if the certificate you got says “ch-zh-jim-01.mycompany.local ” and you try to access the server via https://ch-zh-jim-01, https://jim.mycompany.com, https://localhost or https://127.0.0.1, you’ll get a certificate error much like the following:

certificate error chrome

Also, the REST API won’t work. The caller will throw an exception, e.g. java.security.cert.CertPathValidatorException in Java or System.Security.Authentication.AuthenticationException in DotNet. You can avoid this by forcing your code to not care about invalid certificates but this is a) lazy b) bad c) reaaaaaaaaaaly bad, seriously man, don’t do this unless the API you’re connecting to is completely out of your control (e.g. it belongs to a government).

The correct way

So you need a certificate that is trusted and valid for all the names that will be used to communicate with your server. How do you do that? SIMPLEZ!

  1. Generate a CSR (a certificate signing request, which is a small file you send to the CA) with the alternative names (SANs) you need. That’s what I’ll cover here.
  2. Send it to a trusted CA
    1. either the one your own company operates or
    2. a commercial one (which you have to pay), say Digicert.
  3. Get the signed certificate and install it on your software.

Important note: the CA you send the CSR to must support SANs. Not every CA supports this, for their own reasons. Make sure you read their FAQ or ask their helpdesk. Let’s Encrypt, a free and very popular CA, supports them.

Here I’ll show how you can generate a CSR, both in the “Microsoft World” (i.e. on a Windows machine) and in the “Java World” (i.e. on any machine that has Java installed).

A. Using Windows

Note that this is the GUI way to do this. There’s also a command line tool for this, certreq. I won’t cover it here as this post is already quite long, but you can read a nice guide here and Microsoft’s reference here. One thing to note though is that it’s a bit cumbersome to include SANs with this method.

  1. Open C:\windows\System32\certlm.msc (“Local Computer Certificates”).
  2. Expand “Personal” and right click on “Certificates”. Select “All tasks” > “Advanced Operations” > “Create Custom Request”.
  3. In the “Before you begin” page, click Next.
  4. In the “Select Certificate Enrollment Policy” page, click “Proceed without enrollment policy” and then Next.
  5. In the “Custom Request” page, leave the defaults (CNG key / PKCS #10) and click Next.
  6. In the “Certificate Information” page, click on Details, then on Properties.
  7. In the “General” tab:
    1. In the “Friendly Name” field write a short name for your certificate (that has nothing to do with the server). E.g. cert-jim-05-2021.
    2. In the “Description” field, write a description, duh 😊
  8. In the “Subject” tab:
    1. Under “Subject Name” make sure the “Type” is set to “Full DN” and in the Value field paste the following (without the quotes): “CN=ch-zh-jim-01.mycompany.local, OU=IT, O=mycompany, L=Zurich, ST=ZH, C=CH” and click “Add”. Here:
      • Instead of “ch-zh-jim-01.mycompany.local” enter your full server name, complete with domain name. You can get it by typing ipconfig /all in a command prompt (combine Host Name and Primary Dns Suffix).
      • Instead of “IT” and “mycompany” enter your department and company name respectively.
      • Instead of “Zurich”, “ZH” and “CH” enter the city, state (or Kanton or Bundesland or region or whatever) and country respectively.
    2. Under “Alternative Name”:
      1. Change the type to “IP Address (v4)” and in the Value field type “127.0.0.1”. Click “Add”.
      2. Change the type to “DNS” and in the Value field type the following, clicking “Add” every time:
        • localhost
        • ch-zh-jim-01 (i.e. the server name without the default domain)
        • jim.mycompany.com (i.e. the alias that will be normally used)
        • (add as many names as needed)

Important note: all names you enter there must be resolvable (i.e. there’s a DNS entry for the name) by the CA that will generate your certificate. Otherwise there’s no way they can confirm you’re telling the truth and the request will most likely be rejected.

It should end up looking like this:

  1. In the “Extensions” tab, expand “Extended Key Usage (application policies)”. Select “Server Authentication” and “Client Authentication” and click “Add”.
  2. In the “Private Key” tab, expand “Key Options”.
    1. Set the “Key Size” to 2048 (recommended) or higher.
    2. Check the “Mark private key exportable” check box.
    3. (optional, but HIGHLY recommended) Check the “Strong private key protection” check box. This will make the process ask for a certificate password. Avoid only if your software doesn’t support this (although if it does, you really should question if you should be using it!).

At the end, click OK, then Next. Provide a password (make sure you keep it somewhere safe NOT ON A TEXT FILE ON YOUR DESKTOP, YOU KNOW THAT RIGHT???) and save the CSR file. That’s what you have to send to your CA, according to their instuctions.

B. Using Java

Here the process is sooo much simpler:

  1. Open a command prompt (I’m assuming your Java/bin is in the system path; if not, cd to the bin directory of your Java installation). You should have enough permissions to write to your Java security dir; in Windows, that means that you need an administrative command prompt.
  2. Create the certificate. Type the following, in one line, but given here splitted for clarity. Replace as explained below.
keytool
-genkey
-noprompt
-cacerts
-alias cert-jim-05-2021 
-dname "CN=ch-zh-jim-01.mycompany.local, OU=IT, O=mycompany, L=Zurich, ST=ZH, C=CH" 
-keyalg RSA
-keysize 2048
-storepass changeit
-keypass MYSUPERSECRETPASSWORD
  1. Create the certificate signing request (CSR). Type the following, in one line, but given here splitted for clarity. Replace as explained below.
keytool 
-certreq 
-file c:\temp\cert-jim-05-2021.csr 
-cacerts 
-alias cert-jim-05-2021 
-dname "CN=ch-zh-jim-01.mycompany.local, OU=IT, O=mycompany, L=Zurich, ST=ZH, C=CH" 
-ext "SAN=IP:127.0.0.1,DNS:localhost,DNS:ch-zh-jim-01,DNS:jim.mycompany.com" 
-ext "EKU=serverAuth,clientAuth"
-storepass changeit 
-keypass MYSUPERSECRETPASSWORD

In the steps above, you need to replace:

  • “cert-jim-05-2021”, both in the filename and the alias, with your certificate name (which is the short name for your certificate; this has nothing to do with the server itself).
  • “ch-zh-jim-01.mycompany.local” with the full DNS name of your server.
  • “IT” and “mycompany” with your department and company name respectively.
  • “Zurich”, “ZH” and “CH” with your city, state (or Kanton or Bundesland or region or whatever) and country respectively.
  • “ch-zh-jim-01” with your server name (without the domain name).
  • “jim.mycompany.com” with the DNS alias you’re using. You can add as many as needed, e.g. “DNS:jim.mycompany.com,DNS:jim-server.mycompany.com,DNS:jim.mycompany.gr,DNS:jim.mycompany.ch”

Important note: all names you enter there must be resolvable (i.e. there’s a DNS entry for the name) by the CA that will generate your certificate. Otherwise there’s no way they can confirm you’re telling the truth and the request will most likely be rejected.

  • “changeit” is the default password of the Java certificate store (JAVA_HOME/jre/lib/security/cacerts). It should be replaced by the actual password of the certificate store you’re using. But 99.999% of all java installations never get this changed 😊 so if you don’t know otherwise, leave it as it is.
  • “MYSUPERSECRETPASSWORD” is a password for the certificate. Make sure you keep it somewhere safe NOT ON A TEXT FILE ON YOUR DESKTOP, YOU KNOW THAT RIGHT???

That’s it. The CSR is saved in the path you specified (in the “-file” option). That’s what you have to send to your CA, according to their instuctions.

Enjoy!

[1] no it’s not, c’mon

RabbitMQ: How to move configuration, data and log directories on Windows

A good part of my job has to do with enterprise messaging. When a piece of data -a message- needs to be sent from, say, an invoicing system to an accounting system and then to a customer relationship system and then to the customer portal… it has to navigate treacherous waters.

Avast ye bilge-sucking scurvy dogs! A JSON message from accounting says they hornswaggled 1000 doubloons! Aarrr!!!

So we need to make sure that whatever happens, say if a system is overloaded while receiving the message, the message will not be lost.

A key component in this is message queues (MQ), like RabbitMQ. An MQ plays the middleman; it receives a message from a system and stores it reliably until the next system has confirmed that it picked it up.

My daily duties includes setting up, configuring and maintaining a few RabbitMQ instances. It works great! Honestly, so far -for loads up to a couple of 100s of messages per second- I haven’t even had the need to do any serious tuning.

But one thing that annoys me on Windows is that, after installation, the location of everything except the binaries -configuration, data, logs- is under the profile dir of the user (C:\Users\USERNAME\AppData\Roaming\RabbitMQ) that did the installation, even if the service runs as LocalSystem. Not very good, is it?

Therefore I’ve created this script to help me. The easiest way to use it is run it before you install RabbitMQ. Change the directories in this part and run it from an admin powershell:

# ========== Customize here ==========
$BaseLocation = "C:\mqroot\conf"
$DbLocation = "C:\mqroot\db"
$LogLocation = "C:\mqroot\log"
# ====================================

Then just reboot and run the installation normally; when it starts, RabbitMQ will use the directories you specified.

You can also do it after installation, if you have a running instance and want to move it. In this case do the following (you can find these steps also in the script):

  1. Stop the RabbitMQ service.
  2. From Task Manager, kill the epmd.exe process if present.
  3. Go to the existing base dir (usually C:\Users\USERNAME\AppData\Roaming\RabbitMQ)
    and move it somewhere else (say, C:\temp).
  4. Run this script (don’t forget to change the paths).
  5. Reboot the machine
  6. Run the “RabbitMQ Service (re)install” (from Start Menu).
  7. Copy the contents of the old log dir to $LogLocation.
  8. Copy the contents of the old db dir to $DbLocation.
  9. Copy the files on the root of the old base dir (e.g. advanced.config, enabled_plugins) to $BaseLocation.
  10. Start the RabbitMQ service.

Here’s the script. Have fun šŸ™‚

#
# Source: DotJim blog (http://dandraka.com)
# Jim Andrakakis, March 2021
#

# What this script does is:
#   1. Creates the directories where the configuration, queue data and logs will be stored.
#   2. Downloads a sample configuration file (it's necessary to have one).
#   3. Sets the necessary environment variables.

# If you're doing this before installation: 
# Just run it, reboot and then install RabbitMQ.

# If you're doing this after installation, i.e. if you have a 
# running service and want to move its files:
#   1. Stop the RabbitMQ service
#   2. From Task Manager, kill the epmd.exe process if present
#   3. Go to the existing base dir (usually C:\Users\USERNAME\AppData\Roaming\RabbitMQ)
#      and move it somewhere else (say, C:\temp).
#   4. Run this script.
#   5. Reboot the machine
#   6. Run the "RabbitMQ Service (re)install" (from Start Menu)
#   7. Copy the contents of the old log dir to $LogLocation.
#   8. Copy the contents of the old db dir to $DbLocation.
#   9. Copy the files on the root of the old base dir (e.g. advanced.config, enabled_plugins) 
#      to $BaseLocation.
#   10. Start the RabbitMQ service.

# ========== Customize here ==========

$BaseLocation = "C:\mqroot\conf"
$DbLocation = "C:\mqroot\db"
$LogLocation = "C:\mqroot\log"

# ====================================

$exampleConfUrl = "https://raw.githubusercontent.com/rabbitmq/rabbitmq-server/master/deps/rabbit/docs/rabbitmq.conf.example"

Clear-Host
$ErrorActionPreference = "Stop"

$dirList = @($BaseLocation, $DbLocation, $LogLocation)
foreach($dir in $dirList) {
    if (-not (Test-Path -Path $dir)) {
        New-Item -ItemType Directory -Path $dir
    }
}

# If this fails (e.g. because there's a firewall) you have to download the file 
# from $exampleConfUrl manually and copy it to $BaseLocation\rabbitmq.conf
try {
    Invoke-WebRequest -Uri $exampleConfUrl -OutFile ([System.IO.Path]::Combine($BaseLocation, "rabbitmq.conf"))
}
catch {
    Write-Host "(!) Download of conf file failed. Please download the file manually and copy it to $BaseLocation\rabbitmq.conf"
    Write-Host "(!) Url: $exampleConfUrl"
}

&setx /M RABBITMQ_BASE $BaseLocation
&setx /M RABBITMQ_CONFIG_FILE "$BaseLocation\rabbitmq"
&setx /M RABBITMQ_MNESIA_BASE $DbLocation
&setx /M RABBITMQ_LOG_BASE $LogLocation

Write-Host "Finished. Now you can install RabbitMQ."

New version of XMLSlurper: 1.3.0

I just published a new version of my open source C# XmlSlurper library in Github and Nuget.org.

The new version, 1.3.0, contains two major bug fixes:

  1. In previous versions, when the xml contained CDATA nodes, an error was thrown (“Type System.Xml.XmlCDataSection is not supported”). This has been fixed, so now the following works:
<CustomAttributes>
    <Title><![CDATA[DOCUMENTO N. 1234-9876]]></Title>
</CustomAttributes>

This xml can be used as follows:

var cdata = XmlSlurper.ParseText(getFile("CData.xml"));
Console.WriteLine(cdata.Title);
// produces 'DOCUMENTO N. 1234-9876'
  1. In previous versions, when the xml contained xml comments, an error was thrown (“Type System.Xml.XmlComment is not supported”). This has been fixed; the xml comments are now ignored.

Separately, there are a few more changes that don’t impact the users of the library:

  1. A Github action was added that, when the package version changes, automatically builds and tests the project, creates a Nuget package and publishes it to Nuget.org. That will save me quite some time come next version šŸ™‚
  2. The test project was migrated from DotNet Core 2.2 to 3.1.
  3. The tests were migrated from MSTest to xUnit, to make the project able to be developed both in Windows and in Linux -my personal laptop runs Ubuntu.

The new version is backwards compatible with all previous versions. So if you use it, updating your projects is effortless and strongly recommended.

SQL Server: How to shrink your DB Logs (without putting your job at risk)

This post is mostly a reminder for myself šŸ™‚

When your SQL Server DB log files are growing and your disk is close to being full (or, as it happened this morning, fill up completely thus preventing any DB operation whatsoever, bringing the affected system down!) you need to shrink them.

What this means, basically, is that you create a backup (do NOT skip that!) and then you delete information that allows you to recover the database to any point in time before the backup. That’s what SET RECOVERY SIMPLE & DBCC SHRINKFILE do. And since you kept a backup, you no longer need this information. You don’t need it for operations after the backup though, that’s why we go back to full recovery mode with SET RECOVERY FULL at the end.

So what you need is to login to your SQL Server with admin rights and:

USE DatabaseName
GO
BACKUP DATABASE DatabaseName
TO DISK = 'C:\dbbackup\DatabaseName.bak'
   WITH FORMAT,
      MEDIANAME = 'DatabaseNameBackups',
      NAME = 'Full Backup of DatabaseName';
GO
ALTER DATABASE DatabaseName SET RECOVERY SIMPLE;
GO
CHECKPOINT;
GO
DBCC SHRINKFILE ('DatabaseName_Log', 10);
GO
ALTER DATABASE DatabaseName SET RECOVERY FULL;
GO

Notice the 10 there -that’s the size, in MB, that the DB Log file will shrink to. You probably need to change that to match your DB needs. Also, the DatabaseName_Log is the logical name of your DB Log. You can find it in the DB properties. You probably also need to change the backup path from the example C:\dbbackup\DatabaseName.bak.

I’m a donor for Folding@Home (and you can be as well)

I’m not a fan of IT hubris. I cringe -literally- when I hear stuff like “let’s fight cancer (or whatever) with scrum”. You don’t fight diseases with IT; at best, you can help.

But help can be important. One problem that IT is very well suited to solve is understanding how viruses and bacteria behave under certain circumstances. The Folding@Home project explains:

WHAT IS PROTEIN FOLDING AND HOW IS IT RELATED TO DISEASE?
Proteins are necklaces of amino acids, long chain molecules. They are the basis of how biology gets things done. As enzymes, they are the driving force behind all of the biochemical reactions that make biology work. As structural elements, they are the main constituent of our bones, muscles, hair, skin and blood vessels. As antibodies, they recognize invading elements and allow the immune system to get rid of the unwanted invaders. For these reasons, scientists have sequenced the human genome – the blueprint for all of the proteins in biology – but how can we understand what these proteins do and how they work?

However, only knowing this sequence tells us little about what the protein does and how it does it. In order to carry out their function (e.g. as enzymes or antibodies), they must take on a particular shape, also known as a ā€œfold.ā€ Thus, proteins are truly amazing machines: before they do their work, they assemble themselves! This self-assembly is called ā€œfolding.ā€

WHAT HAPPENS IF PROTEINS DON’T FOLD CORRECTLY?
Diseases such as Alzheimer’s disease, Huntington’s disease, cystic fibrosis, BSE (Mad Cow disease), an inherited form of emphysema, and even many cancers are believed to result from protein misfolding. When proteins misfold, they can clump together (ā€œaggregateā€). These clumps can often gather in the brain, where they are believed to cause the symptoms of Mad Cow or Alzheimer’s disease.

The project has made it very easy for anyone to help. You just download and install their software, and your computer starts calculating, solving math problems -essentially, you’re giving your computer’s processing power when you don’t use it. You can see your -and other’s- contribution in the project stats.

I heartily encourage you to do so.

folding at home stats
That’s my HP i7, sitting in the attic and doing what little it can to help beat COVID19.

Powershell & Microsoft Dynamics CRM: get results and update records with paging

I’ve written before an example on how to use Powershell and FetchXml to get records from a Dynamics CRM instance. But there’s a limit, by default 5000 records, on how many records CRM returns in a single batch -and for good reason. There are many blog posts out there on how to increase the limit or even turn it off completely but this is missing the point: you really really really don’t want tens or hundreds of thousand -or, god forbid, millions- of records being returned in a single operation. That would probably fail for a number of reasons, not to mention it would slow the whole system to a crawl for a very long time!

So we really should do it the right way, which is to use paging. It’s not even hard! It’s basically almost the same thing, you just need to add a loop.

That’s the code I wrote to update all active records (the filter is in the FetchXml, so you can just create yours and the code doesn’t change). I added a progress indicator so that I get a sense of performance.

#
# Source: DotJim blog (http://dandraka.com)
# Jim Andrakakis, June 2020
#
# Prerequisites:
# 1. Install PS modules
#    Run the following in a powershell with admin permissions:
#       Install-Module -Name Microsoft.Xrm.Tooling.CrmConnector.PowerShell
#       Install-Module -Name Microsoft.Xrm.Data.PowerShell -AllowClobber
#
# 2. Write password file
#    Run the following and enter your user's password when prompted:
#      Read-Host -assecurestring | convertfrom-securestring | out-file C:\usr\crm\crmcred.pwd
#
# ============ Constants to change ============
$pwdFile = "C:\usr\crm\crmcred.pwd"
$username = "myuser@mycompany.com"
$serverurl = "https://myinstance.crm4.dynamics.com"
$fetchxml = "C:\usr\crm\all_active.xml"
# =============================================

Clear-Host
$ErrorActionPreference = "Stop"

# ============ Login to MS CRM ============
$password = get-content $pwdFile | convertto-securestring
$cred = new-object -typename System.Management.Automation.PSCredential -argumentlist $username,$password
try
{
    $connection = Connect-CRMOnline -Credential $cred -ServerUrl $serverurl
}
catch
{
    Write-Host $_.Exception.Message 
    exit
}
if($connection.IsReady -ne $True)
{
    $errorDescr = $connection.LastCrmError
    Write-Host "Connection not established: $errorDescr"
    exit
}
else
{
    Write-Host "Connection to $($connection.ConnectedOrgFriendlyName) successful"
}

# ============ Fetch data ============
[string]$fetchXmlStr = Get-Content -Path $fetchxml

$list = New-Object System.Collections.ArrayList
# Be careful, NOT zero!
$pageNumber = 1
$pageCookie = ''
$nextPage = $true

$StartDate1=Get-Date

while($nextPage)
{    
    if ($pageNumber -eq 1) {
        $result = Get-CrmRecordsByFetch -conn $connection -Fetch $fetchXmlStr 
    }
    else {
        $result = Get-CrmRecordsByFetch -conn $connection -Fetch $fetchXmlStr -PageNumber $pageNumber -PageCookie $pageCookie
    }

    $EndDate1=Get-Date
    $ts1 = New-TimeSpan –Start $StartDate1 –End $EndDate1

    $list.AddRange($result.CrmRecords)

    Write-Host "Fetched $($list.Count) records in $($ts1.TotalSeconds) sec"    

    $pageNumber = $pageNumber + 1
    $pageCookie = $result.PagingCookie
    $nextPage = $result.NextPage
}


# ============ Update records ============
$StartDate2=Get-Date

$i = 0
foreach($rec in $list) {
    $crmId = $rec.accountid
    $entity = New-Object Microsoft.Xrm.Sdk.Entity("account")
    $entity.Id = [Guid]::Parse($crmId)
    $entity.Attributes["somestringfieldname"] = "somevalue"
    $entity.Attributes["somedatefieldname"] = [datetime]([DateTime]::Now.ToString("u"))
    $connection.Update($entity)
    $i = $i+1
    # this shows progress and time every 1000 records
    if (($i % 1000) -eq 0) {
        $EndDate2=Get-Date
        $ts2 = New-TimeSpan –Start $StartDate2 –End $EndDate2
        Write-Host "Updating $i / $($list.Count) in $($ts2.TotalSeconds) sec"
    }
}

$EndDate2=Get-Date
$ts2 = New-TimeSpan –Start $StartDate2 –End $EndDate2

Write-Host "Updated $($list.Count) records in $($ts2.TotalSeconds) sec"

For my purposes I used the following FetchXml. You can customize it or use CRM’s advanced filter to create yours:

<fetch version="1.0" output-format="xml-platform" mapping="logical" distinct="false">
  <entity name="account">
    <attribute name="accountid" />
    <order attribute="accountid" descending="false" />
    <filter type="and">
      <condition attribute="statecode" operator="eq" value="0" />
    </filter>
  </entity>
</fetch>

Something to keep in mind here is to minimize the amount of data being queried from CRM’s database and then downloaded. Since we’re talking about a lot of records, it’s wise to check your FetchXml and eliminate all fields that are not needed.

Enjoy!