Tag Archives: microsoft

Powershell & Microsoft Dynamics CRM: get results and update records for large datasets with paging

I’ve written before an example on how to use Powershell and FetchXml to get records from a Dynamics CRM instance. But there’s a limit, by default 5000 records, on how many records CRM returns in a single batch -and for good reason. There are many blog posts out there on how to increase the limit or even turn it off completely but this is missing the point: you really really really don’t want tens or hundreds of thousand -or, god forbid, millions- of records being returned in a single operation. That would probably fail for a number of reasons, not to mention it would slow the whole system to a crawl for a very long time!

So we really should do it the right way, which is to use paging. It’s not even hard! It’s basically almost the same thing, you just need to add a loop.

So that’s the code I wrote to update all active records (the filter is in the FetchXml, so you can just create yours and the code doesn’t change). I added a progress indicator so that I get a sense of performance.

#
# Source: DotJim blog (http://dandraka.com)
# Jim Andrakakis, June 2020
#
# Prerequisites:
# 1. Install PS modules
#    Run the following in a powershell with admin permissions:
#       Install-Module -Name Microsoft.Xrm.Tooling.CrmConnector.PowerShell
#       Install-Module -Name Microsoft.Xrm.Data.PowerShell -AllowClobber
#
# 2. Write password file
#    Run the following and enter your user's password when prompted:
#      Read-Host -assecurestring | convertfrom-securestring | out-file C:\usr\crm\crmcred.pwd
#
# ============ Constants to change ============
$pwdFile = "C:\usr\crm\crmcred.pwd"
$username = "myuser@mycompany.com"
$serverurl = "https://myinstance.crm4.dynamics.com"
$fetchxml = "C:\usr\crm\all_active.xml"
# =============================================

Clear-Host
$ErrorActionPreference = "Stop"

# ============ Login to MS CRM ============
$password = get-content $pwdFile | convertto-securestring
$cred = new-object -typename System.Management.Automation.PSCredential -argumentlist $username,$password
try
{
    $connection = Connect-CRMOnline -Credential $cred -ServerUrl $serverurl
}
catch
{
    Write-Host $_.Exception.Message 
    exit
}
if($connection.IsReady -ne $True)
{
    $errorDescr = $connection.LastCrmError
    Write-Host "Connection not established: $errorDescr"
    exit
}
else
{
    Write-Host "Connection to $($connection.ConnectedOrgFriendlyName) successful"
}

# ============ Fetch data ============
[string]$fetchXmlStr = Get-Content -Path $fetchxml

$list = New-Object System.Collections.ArrayList
$pageNumber = 1
$pageCookie = ''
$nextPage = $true

$StartDate1=Get-Date

while($nextPage)
{    
    if ($pageNumber -eq 1) {
        $result = Get-CrmRecordsByFetch -conn $connection -Fetch $fetchXmlStr 
    }
    else {
        $result = Get-CrmRecordsByFetch -conn $connection -Fetch $fetchXmlStr -PageNumber $pageNumber -PageCookie $pageCookie
    }

    $EndDate1=Get-Date
    $ts1 = New-TimeSpan –Start $StartDate1 –End $EndDate1

    $list.AddRange($result.CrmRecords)

    Write-Host "Fetched $($list.Count) records in $($ts1.TotalSeconds) sec"    

    $pageNumber = $pageNumber + 1
    $pageCookie = $result.PagingCookie
    $nextPage = $result.NextPage
}


# ============ Update records ============
$StartDate2=Get-Date

$i = 0
foreach($rec in $list) {
    $crmId = $rec.accountid
    $entity = New-Object Microsoft.Xrm.Sdk.Entity("account")
    $entity.Id = [Guid]::Parse($crmId)
    $entity.Attributes["somestringfieldname"] = "somevalue"
    $entity.Attributes["somedatefieldname"] = [datetime]([DateTime]::Now.ToString("u"))
    $connection.Update($entity)
    $i = $i+1
    # this shows progress and time every 1000 records
    if (($i % 1000) -eq 0) {
        $EndDate2=Get-Date
        $ts2 = New-TimeSpan –Start $StartDate2 –End $EndDate2
        Write-Host "Updating $i / $($list.Count) in $($ts2.TotalSeconds) sec"
    }
}

$EndDate2=Get-Date
$ts2 = New-TimeSpan –Start $StartDate2 –End $EndDate2

Write-Host "Updated $($list.Count) records in $($ts2.TotalSeconds) sec"

For my purposes I used the following FetchXml. You can customize it or use CRM’s advanced filter to create yours:

<fetch version="1.0" output-format="xml-platform" mapping="logical" distinct="false">
  <entity name="account">
    <attribute name="accountid" />
    <order attribute="accountid" descending="false" />
    <filter type="and">
      <condition attribute="statecode" operator="eq" value="0" />
    </filter>
  </entity>
</fetch>

Something to keep in mind here is to minimize the amount of data being queried from CRM’s database and then downloaded. Since we’re talking about a lot of records, it’s wise to check your FetchXml and eliminate all fields that are not needed.

Enjoy!

Powershell: How do you add inline C#?

Powershell is great for admin tasks. Stuff like iterating through files and folders, copying and transforming files are very, very easily done. But inevitably there will always be stuff that are easier to do via a “normal” language such as C#.

Trying to solve a problem I had at work, I needed to transform a CSV file by changing the fields -which is easily done via powershell- and, at the same time, do a “get only the highest record of every group”. This is done with LINQ, which you can use in powershell but it’s cumbersome and will result in many, many lines of code.

So I wanted to do this in a more clean way, in C#. The general template to include C# inside a powershell script is the following:

#
# Source: DotJim blog (http://dandraka.com)
# Jim Andrakakis, November 2018
#
# Here goes the C# code:
Add-Type -Language CSharp @"
using System; 
namespace DotJim.Powershell 
{
    public static class Magician 
    {
        private static string spell = ""; 
        public static void DoMagic(string magicSpell) 
        {
            spell = magicSpell; 
        }
        public static string GetMagicSpells() 
        {
            return "Wingardium Leviosa\r\n" + spell; 
        }
    }
}
"@;

# And here's how to call it:
[DotJim.Powershell.Magician]::DoMagic("Expelliarmus")
$spell = [DotJim.Powershell.Magician]::GetMagicSpells()

Write-Host $spell

Note here that the C# classes don’t have to be static; but if they are, they’re easier to call (no instantiation needed). Of course this only works if all you need to do is provide an input and get a manipulated output. If you need more complex stuff then yes, you can use non-static classes or whatever C# functionality solves your problems. Here’s the previous example, but with a non-static class:

#
# Source: DotJim blog (https://dandraka.com)
# Jim Andrakakis, November 2018
#
# Here goes the C# code:
Add-Type -Language CSharp @"
using System; 
namespace DotJim.Powershell 
{
    public class Magician 
    {
        private string spell = ""; 
        public void DoMagic(string magicSpell) 
        {
            spell = magicSpell; 
        }
        public string GetMagicSpells() 
        {
            return "Wingardium Leviosa\r\n" + spell; 
        }
    }
}
"@;

# Here's how to create an instance:
$houdini = New-Object -TypeName DotJim.Powershell.Magician
# And here's how to call it:
$houdini.DoMagic("Expelliarmus")
$spell = $houdini.GetMagicSpells()

Write-Host $spell

The main advantage of having C# inside the powershell script (and not in a separate dll file) is that it can be deployed very easily with various Devops tools. Otherwise you need to deploy the dll alongside which can, sometimes, be the source of trouble.

So here’s my complete working code, which worked quite nicely:

#
# Source: DotJim blog (http://dandraka.com)
# Jim Andrakakis, November 2018
#
# The purpose of this script is to read a CSV file with bank data
# and transform it into a different CSV.
#
# 1. The Bank class is a POCO to hold the data which I need
#    from every line of the CSV file.
# 2. The Add() method of the BankAggregator class adds the
#    record to the list after checking the data for correctness.
# 3. The Get() methof of the BankAggregator class does a
#    LINQ query to get the 1st (max BankNr) bank record
#    from every record with the same Country/BIC.
#    It then returns a list of strings, formatted the way
#    I want for the new (transformed) CSV file.
#
# Here is where I inline the C# code:
Add-Type -Language CSharp @"
using System;
using System.Collections.Generic;
using System.Linq;
namespace DotJim.Powershell {
 public class Bank {
  public int BankNr;
  public string Country;
  public string BIC;
 }
 public static class BankAggregator {
  private static List list = new List();
  public static void Add(string country, string bic, string bankNr) {
   //For debugging
   //Console.WriteLine(string.Format("{0}{3}{1}{3}{3}{2}", country, bic, bankNr, ";"));
   int mBankNr;
   // Check data for correctness, discard if not ok
   if (string.IsNullOrWhiteSpace(country) ||
    country.Length != 2 ||
    string.IsNullOrWhiteSpace(bic) ||
    string.IsNullOrWhiteSpace(bankNr) ||
    !int.TryParse(bankNr, out mBankNr) ||
    mBankNr & gt; = 0) {
    return;
   }
   list.Add(new Bank() {
    BankNr = mBankNr, Country = country, BIC = bic
   });
  }
  public static List Get(string delimiter) {
   // For every record with the same Country & BIC, keep only
   // the record with the highest BankNr
   var bankList = from b in list
   group b by new {
    b.Country, b.BIC
   }
   into bankGrp
   let maxBankNr = bankGrp.Max(x = & gt; x.BankNr)
   select new Bank {
    Country = bankGrp.Key.Country,
     BIC = bankGrp.Key.BIC,
     BankNr = maxBankNr
   };
   // Format the list the way I want the new CSV file to look
   return bankList.Select(x = & amp; amp; amp; amp; amp; amp; amp; amp; amp; amp; amp; amp; amp; gt; string.Format("{0}{3}{1}{3}{3}{2}",
    x.Country, x.BIC, x.BankNr, delimiter)).ToList();
  }
 }
}
"@;

# Read one or more files with bank data from the same dir
# where the script is located ($PSScriptRoot)
$srcSearchStr = "source_bankdata*.csv"
$SourcePath = $PSScriptRoot
$destPath = $SourcePath

$fields = @("Country","BIC","EmptyField","BankId")

$filesList = Get-ChildItem -Path $SourcePath -Filter $srcSearchStr

foreach ($file in $filesList)
{
Write-Host "Processing" $file.FullName

# Fields in the source CSV:
# BANKNUMMER  = BankNr
# BANKLAND    = Country
# BANKSWIFT   = BIC
$data = Import-Csv -Path $file.FullName -Delimiter ";"

foreach ($item in $data)
{
# Call the C# code to add the CSV lines to the list
[DotJim.Powershell.BankAggregator]::Add($item.BANKLAND,$item.BANKSWIFT,$item.BANKNUMMER)
}

# Call the C# code to get the transformed data
$list = [DotJim.Powershell.BankAggregator]::Get(";")

Write-Host "Found" $list.Count "valid rows"

# Now that we have the list, write it in the new CSV
Out-File -FilePath "$destPath\transformed_bankdata_$(New-Guid).csv" -Encoding UTF8 -InputObject $list
}

Have fun coding!

Powershell & Microsoft Dynamics CRM: how to get results using a FetchXml

[Update June 2020] There’s a newer post that does the same as this and is more complete -it includes paging and updating records. You might want to check it out here.

If you’ve used Microsoft CRM as a power user (on-premise or online), chances are you’ve come across the standard way of querying CRM data, FetchXml.

You can run this by hand but of course the real power of it is using it to automate tasks. And another great way to automate tasks in Windows is, naturally, powershell.

So here’s a script I’m using to run a fetch xml and export the results to a csv file:

#
# Source: DotJim blog (http://dandraka.com)
# Jim Andrakakis, May 2018
#
# Prerequisites:
# 1. Install PS modules
#    Run the following in a powershell with admin permissions:
#       Install-Module -Name Microsoft.Xrm.Tooling.CrmConnector.PowerShell
#       Install-Module -Name Microsoft.Xrm.Data.PowerShell -AllowClobber
#
# 2. Write password file
#    Run the following and enter your user's password when prompted:
#      Read-Host -assecurestring | convertfrom-securestring | out-file C:\temp\crmcred.pwd
#
# ============ Constants to change ============
$pwdFile = "C:\temp\crmcred.pwd"
$username = "myusername@mycompany.com"
$serverurl = "https://my-crm-instance.crm4.dynamics.com"
$fetchXmlFile = "c:\temp\fetch.xml"
$exportfile = "C:\temp\crm_export.csv"
$exportdelimiter = ";"
# =============================================
# ============ Login to MS CRM ============
$password = get-content $pwdFile | convertto-securestring
$cred = new-object -typename System.Management.Automation.PSCredential -argumentlist $username,$password
try
{
    $connection = Connect-CRMOnline -Credential $cred -ServerUrl $serverurl
    # for on-prem use :
    #   $connection = Connect-CrmOnPremDiscovery -Credential $cred -ServerUrl $serverurl
    # you can also use interactive mode if you get e.g. problems with multi-factor authentication
    #   $connection = Connect-CrmOnlineDiscovery -InteractiveMode -Credential $cred
    # or you can use a connection string if you want to use e.g. OAuth or a Client Secret
    # but then the password must be plaintext which is kind of a security no-no
    #   $connString = "AuthType=ClientSecret;url=$serverurl;ClientId=$username;ClientSecret=$password"
    #   $connection = Get-CrmConnection -ConnectionString $connString
}
catch
{
    Write-Host $_.Exception.Message
    exit
}
if($connection.IsReady -ne $True)
{
    $errorDescr = $connection.LastCrmError
    Write-Host "Connection not established: $errorDescr"
    exit
}
else
{
    Write-Host "Connection to $($connection.ConnectedOrgFriendlyName) successful"
}
# ============ Fetch data ============
$fetchXml = [xml](Get-Content $fetchXmlFile)
$result = Get-CrmRecordsByFetch -conn $connection -Fetch $fetchXml.OuterXml
# ============ Write to file ============
# Obviously here, instead of writing to csv directly, you can loop and do whatever suits your needs, e.g. run a db query, call a web service etc etc
$result.CrmRecords | Select -Property lastname, firstname | Export-Csv -Encoding UTF8 -Path $exportfile -NoTypeInformation -Delimiter $exportdelimiter

When you use your own FetchXml, do remember to change the properties in the last line (lastname, firstname).

For a quick test, the example FetchXml I’m using is the following:

<fetch mapping="logical" version="1.0">
    <entity name="account">
        <attribute name="customertypecode" alias="customertypecode"/>
        <attribute name="name" alias="company_name"/>
        <attribute name="emailaddress1" alias="company_emailaddress1"/>
        <link-entity name="contact" from="accountid" to="accountid" link-type="inner">
            <attribute name="lastname" alias="lastname"/>
            <attribute name="firstname" alias="firstname"/>
        </link-entity>
    </entity>
</fetch>

Have fun coding!

Do execution plans change when using different filter values?

(short answer: yes!)

Anyone who develops software that interacts with a database knows (read: should know) how to read a query execution plan, given by “EXPLAIN PLAN”, and how to avoid at least the most common problems like a full table scan.

It is obvious that a plan can change if the database changes. For example if we add an index that is relevant to our query, it will be used to make our query faster. And this will be reflected in the new plan.

Likewise if the query changes. If instead of

SELECT * FROM mytable WHERE somevalue > 5

the query changes to

SELECT * FROM mytable WHERE somevalue IN 
  (SELECT someid FROM anothertable)

the plan will of course change.

So during a database performance tuning seminar at work, we came to the following question: can the execution plan change if we just change the filter value? Like, if instead of

SELECT * FROM mytable WHERE somevalue > 5

the query changes to

SELECT * FROM mytable WHERE somevalue > 10

It’s not obvious why it should. The columns used, both in the SELECT and the WHERE clause, do not change. So if a human would look at these two queries, they would select the same way of executing them (e.g. using an index on somevalue if one is available).

But databases have a knowledge we don’t have. They have statistics.

Let’s do an example. We’ll use Microsoft SQL server here. The edition doesn’t really matter, you can use Express for example. But the idea, and the results, are the same for Oracle or any other major RDBMS.

First off, let’s create a database. Open Management Studio and paste the following (changing the paths as needed):

CREATE DATABASE [PLANTEST]  
CONTAINMENT = NONE  
ON  PRIMARY  
( NAME = N'PLANTEST',  
FILENAME = N'C:\DATA\PLANTEST.mdf' ,  
SIZE = 180MB , FILEGROWTH = 10% )  
LOG ON  
( NAME = N'PLANTEST_log',  
FILENAME = N'C:\DATA\PLANTEST_log.ldf' ,  
SIZE = 20MB , FILEGROWTH = 10%) 
GO

Note that, by default, I’ve allocated a lot of space, 180MB. There’s a reason for that; We know that we’ll pump in a lot of data, and we want to avoid the delay of the db files growing.

Now let’s create a table to work on:

USE PLANTEST 
GO 
CREATE TABLE dbo.TESTWORKLOAD 
( testid int NOT NULL IDENTITY(1,1), 
testname char(10) NULL, 
testdata nvarchar(36) NULL )  
ON [PRIMARY] 
GO 

And let’s fill it (this can take some time, say around 5-10 minutes):

DECLARE @cnt1 INT = 0;
DECLARE @cnt2 INT = 0;

WHILE @cnt1 < 20
BEGIN
	SET @cnt2 = 0;
	WHILE @cnt2 < 100000
	BEGIN
	   insert into TESTWORKLOAD (testname, testdata) 
             values ('COMMON0001', CONVERT(char(36), NEWID()));
	   SET @cnt2 = @cnt2 + 1;
	END;
	insert into TESTWORKLOAD (testname, testdata) 
          values ('SPARSE0002', CONVERT(char(36), NEWID()));
	SET @cnt1 = @cnt1 + 1;
END;
GO

What I did here is, basically, I filled the table with 2 million (20 * 100000) plus 20 rows. Almost all of them (2 million) in the testname field, have the value “COMMON0001”. But a few, only 20, have a different value, “SPARSE0002”.

Essentially the table is our proverbial haystack. The “COMMON0001” rows are the hay, and the “SPARSE0002” rows are the needles 🙂

Let’s examine how the database will execute these two queries:

SELECT * FROM TESTWORKLOAD WHERE testname = 'COMMON0001';
SELECT * FROM TESTWORKLOAD WHERE testname = 'SPARSE0002';

Select both of them and, in management studio, press Control+L or the “Display estimated execution plan” button. What you will see is this:

What you see here is that both queries will do a full table scan. That means that the database will go and grab every single row from the table, look at the rows one by one, and give us only the ones who match (the ones with COMMON0001 or SPARSE0002, respectively).

That’s ok when you don’t have a lot of rows (say, up to 5 or 10 thousand), but it’s terribly slow when you have a lot (like our 2 million).

So let’s create an index for that:

CREATE NONCLUSTERED INDEX [IX_testname] ON [dbo].[TESTWORKLOAD]
(
	[testname] ASC
)
GO

And here’s where you watch the magic happen. Select the same queries as above and press Control+L (or the “Display estimated execution plan” button) again. Voila:

What you see here is that, even though the only difference between the two queries is the filter value, the execution plan changes.

Why does this happen? And how?

Well, here’s where statistics are handy. On the Object Explorer of management studio, expand (the “+”) our database and table, and then the “Statistics” folder.

You can see the statistic for our index, IX_testname. If you open it (double click and then go to “details”) you see the following:

So (I’m simplifying a bit here, but not a lot) the database knows how many rows have the value “COMMON0001” (2 million) and how many the value “SPARSE0002” (just 20).

Knowing this, it concludes (that’s the job of the query optimizer) that the best way to execute the 2 queries is different:

The first one (WHERE testname = ‘COMMON0001’) will return almost all the rows of the table. Knowing this, the optimizer decides that it’s faster to just get everything (aka Full Table Scan) and filter out the very few rows we don’t need.

For the second one (WHERE testname = ‘SPARSE0002’), things are different. The optimizer knows that it’s looking only for a few rows, and it’s smartly using the index to find them as fast as possible.

In plain English, if you want the hay out of a haystack, you just get the whole stack. But if you’re looking for the needles, you go find them one by one.

Oh the joy of coding with German Office

So I’m maintaining an Access database, with lots of VBA code, which serves as an internal management tool.

Changing a report should be, and usually is, pretty straight forward. But there I am today, after all changes are done, stuck trying to understand WHY THE FREAKING F*** a format string doesn’t work.

The offender is a text box with a date, and I’m trying to get it to be displayed like Nov-2015. So, according to Microsoft’s documentation, the format string is mmm-yyyy. Needless to say, didn’t work.

After an hour or so of banging my head on the wall, enlightenment comes: Year in German is Jahr !!! So mmm-jjjj, and, pronto, it worked like a charm.

By the way, Microsoft’s german doc is wrong (seems like a copy-paste error): it mentions jj and yyyy, instead of jj and jjjj which work (giving 15 and 2015 respectively).

GRRRRRRRRRRRRRRRRRRR *#@$%