I’m a donor for Folding@Home (and you can be as well)

I’m not a fan of IT hubris. I cringe -literally- when I hear stuff like “let’s fight cancer (or whatever) with scrum”. You don’t fight diseases with IT; at best, you can help.

But help can be important. One problem that IT is very well suited to solve is understanding how viruses and bacteria behave under certain circumstances. The Folding@Home project explains:

WHAT IS PROTEIN FOLDING AND HOW IS IT RELATED TO DISEASE?
Proteins are necklaces of amino acids, long chain molecules. They are the basis of how biology gets things done. As enzymes, they are the driving force behind all of the biochemical reactions that make biology work. As structural elements, they are the main constituent of our bones, muscles, hair, skin and blood vessels. As antibodies, they recognize invading elements and allow the immune system to get rid of the unwanted invaders. For these reasons, scientists have sequenced the human genome – the blueprint for all of the proteins in biology – but how can we understand what these proteins do and how they work?

However, only knowing this sequence tells us little about what the protein does and how it does it. In order to carry out their function (e.g. as enzymes or antibodies), they must take on a particular shape, also known as a “fold.” Thus, proteins are truly amazing machines: before they do their work, they assemble themselves! This self-assembly is called “folding.”

WHAT HAPPENS IF PROTEINS DON’T FOLD CORRECTLY?
Diseases such as Alzheimer’s disease, Huntington’s disease, cystic fibrosis, BSE (Mad Cow disease), an inherited form of emphysema, and even many cancers are believed to result from protein misfolding. When proteins misfold, they can clump together (“aggregate”). These clumps can often gather in the brain, where they are believed to cause the symptoms of Mad Cow or Alzheimer’s disease.

The project has made it very easy for anyone to help. You just download and install their software, and your computer starts calculating, solving math problems -essentially, you’re giving your computer’s processing power when you don’t use it. You can see your -and other’s- contribution in the project stats.

I heartily encourage you to do so.

That’s my HP i7, sitting in the attic and doing what little it can to help beat COVID19.

Powershell & Microsoft Dynamics CRM: get results and update records for large datasets with paging

I’ve written before an example on how to use Powershell and FetchXml to get records from a Dynamics CRM instance. But there’s a limit, by default 5000 records, on how many records CRM returns in a single batch -and for good reason. There are many blog posts out there on how to increase the limit or even turn it off completely but this is missing the point: you really really really don’t want tens or hundreds of thousand -or, god forbid, millions- of records being returned in a single operation. That would probably fail for a number of reasons, not to mention it would slow the whole system to a crawl for a very long time!

So we really should do it the right way, which is to use paging. It’s not even hard! It’s basically almost the same thing, you just need to add a loop.

So that’s the code I wrote to update all active records (the filter is in the FetchXml, so you can just create yours and the code doesn’t change). I added a progress indicator so that I get a sense of performance.

#
# Source: DotJim blog (http://dandraka.com)
# Jim Andrakakis, June 2020
#
# Prerequisites:
# 1. Install PS modules
#    Run the following in a powershell with admin permissions:
#       Install-Module -Name Microsoft.Xrm.Tooling.CrmConnector.PowerShell
#       Install-Module -Name Microsoft.Xrm.Data.PowerShell -AllowClobber
#
# 2. Write password file
#    Run the following and enter your user's password when prompted:
#      Read-Host -assecurestring | convertfrom-securestring | out-file C:\usr\crm\crmcred.pwd
#
# ============ Constants to change ============
$pwdFile = "C:\usr\crm\crmcred.pwd"
$username = "myuser@mycompany.com"
$serverurl = "https://myinstance.crm4.dynamics.com"
$fetchxml = "C:\usr\crm\all_active.xml"
# =============================================

Clear-Host
$ErrorActionPreference = "Stop"

# ============ Login to MS CRM ============
$password = get-content $pwdFile | convertto-securestring
$cred = new-object -typename System.Management.Automation.PSCredential -argumentlist $username,$password
try
{
    $connection = Connect-CRMOnline -Credential $cred -ServerUrl $serverurl
}
catch
{
    Write-Host $_.Exception.Message 
    exit
}
if($connection.IsReady -ne $True)
{
    $errorDescr = $connection.LastCrmError
    Write-Host "Connection not established: $errorDescr"
    exit
}
else
{
    Write-Host "Connection to $($connection.ConnectedOrgFriendlyName) successful"
}

# ============ Fetch data ============
[string]$fetchXmlStr = Get-Content -Path $fetchxml

$list = New-Object System.Collections.ArrayList
# Be careful, NOT zero!
$pageNumber = 1
$pageCookie = ''
$nextPage = $true

$StartDate1=Get-Date

while($nextPage)
{    
    if ($pageNumber -eq 1) {
        $result = Get-CrmRecordsByFetch -conn $connection -Fetch $fetchXmlStr 
    }
    else {
        $result = Get-CrmRecordsByFetch -conn $connection -Fetch $fetchXmlStr -PageNumber $pageNumber -PageCookie $pageCookie
    }

    $EndDate1=Get-Date
    $ts1 = New-TimeSpan –Start $StartDate1 –End $EndDate1

    $list.AddRange($result.CrmRecords)

    Write-Host "Fetched $($list.Count) records in $($ts1.TotalSeconds) sec"    

    $pageNumber = $pageNumber + 1
    $pageCookie = $result.PagingCookie
    $nextPage = $result.NextPage
}


# ============ Update records ============
$StartDate2=Get-Date

$i = 0
foreach($rec in $list) {
    $crmId = $rec.accountid
    $entity = New-Object Microsoft.Xrm.Sdk.Entity("account")
    $entity.Id = [Guid]::Parse($crmId)
    $entity.Attributes["somestringfieldname"] = "somevalue"
    $entity.Attributes["somedatefieldname"] = [datetime]([DateTime]::Now.ToString("u"))
    $connection.Update($entity)
    $i = $i+1
    # this shows progress and time every 1000 records
    if (($i % 1000) -eq 0) {
        $EndDate2=Get-Date
        $ts2 = New-TimeSpan –Start $StartDate2 –End $EndDate2
        Write-Host "Updating $i / $($list.Count) in $($ts2.TotalSeconds) sec"
    }
}

$EndDate2=Get-Date
$ts2 = New-TimeSpan –Start $StartDate2 –End $EndDate2

Write-Host "Updated $($list.Count) records in $($ts2.TotalSeconds) sec"

For my purposes I used the following FetchXml. You can customize it or use CRM’s advanced filter to create yours:

<fetch version="1.0" output-format="xml-platform" mapping="logical" distinct="false">
  <entity name="account">
    <attribute name="accountid" />
    <order attribute="accountid" descending="false" />
    <filter type="and">
      <condition attribute="statecode" operator="eq" value="0" />
    </filter>
  </entity>
</fetch>

Something to keep in mind here is to minimize the amount of data being queried from CRM’s database and then downloaded. Since we’re talking about a lot of records, it’s wise to check your FetchXml and eliminate all fields that are not needed.

Enjoy!