Check out the new features for the IBM Watson apps each week.
Week of 20 March 2018
New names for the Data Science Experience and Data Catalog apps!
The new names better align with new AI features:
- Data Science Experience is now named Watson Studio. See the Watson Studio overview.
- Data Catalog is now named Watson Knowledge Catalog. See the Watson Knowledge Catalog overview.
Machine learning and AI
Image classification with Visual Recognition
You can now use IBM Watson Visual Recognition within Watson Studio to classify images. Visual Recognition uses deep learning algorithms to analyze images for scenes, objects, faces, and other content. You use the Visual Recognition model builder tool to quickly and easily train and test custom models. See Visual Recognition overview.
You can now use deep learning techniques to train thousands of models to identify the right combination of data and hyperparameters that optimizes the performance of your neural networks. You can run more experiments faster, train deeper networks, and explore broader hyperparameter spaces. Watson Machine Learning accelerates this iterative cycle by simplifying the process of training models in parallel with an on-demand GPU compute cluster. See Deep learning.
You can use the Experiment Assistant tool to define training runs for your experiment and automatically optimize hyperparameters. See Experiment Assistant.
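The Watson Machine Learning APIs aren't shown here, but conceptually, hyperparameter optimization searches a space of settings for the best-scoring model. A minimal grid-search sketch in plain Python, where the scoring function is a stand-in for a real training run and the parameter names are illustrative:

```python
import itertools

def train_and_score(learning_rate, batch_size):
    # Stand-in for a real training run: a genuine experiment would train
    # a neural network with these settings and return its validation accuracy.
    return 1.0 - abs(learning_rate - 0.01) - abs(batch_size - 64) / 1000.0

# The hyperparameter space to explore; names and values are made up.
space = {
    "learning_rate": [0.001, 0.01, 0.1],
    "batch_size": [32, 64, 128],
}

best_params, best_score = None, float("-inf")
for lr, bs in itertools.product(space["learning_rate"], space["batch_size"]):
    score = train_and_score(lr, bs)
    if score > best_score:
        best_params, best_score = {"learning_rate": lr, "batch_size": bs}, score

print(best_params)
```

The Experiment Assistant automates this loop (and smarter strategies than exhaustive search) across parallel training runs on GPU hardware.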
You can use the neural network designer tool to create deep learning flows. Design deep models for image data (CNN architectures) and for text and audio data (RNN architectures). The neural network designer supports 31 types of layers. Any architecture that can be built from a combination of these 31 layers can be designed in the flow modeler and then published as a training definition file. See Neural network designer.
You can now create a machine learning flow, which is a graphical representation of a data model, or a deep learning flow, which is a graphical representation of a neural network design, by using the Flow Editor. Use it to prepare or shape data, train or deploy a model, or transform data and export it back to a database table or file in IBM Cloud Object Storage. See Modeler flows.
Create a project with tools specific to your needs
When you create the project, you can now choose the project tile that fits your needs. The tile selection affects the type of assets you can add to the project, the tools you can use, and the IBM Cloud services you need.
You can choose from these tiles when you create a project from the Watson Studio home page:
- Basic: Add collaborators and data assets.
- Complete: All tools are available. You can add services as you need them.
- Data Preparation: Cleanse and shape data.
- Jupyter notebooks: Analyze data with Jupyter notebooks or RStudio.
- Experiment Assistant: Develop neural networks and test them in deep learning experiments.
- Modeler: Build, train, test, and deploy machine learning models.
- Streams Designer: Ingest streaming data.
- Visual Recognition: Classify images.
If you create a project from the My Projects page, your project has all tools.
After you create the project, you can add or remove tools on the Settings page.
See Project tools.
Create dashboards to visualize data without coding
With the analytics dashboard, you can build sophisticated visualizations of your analytics results, communicate the insights that you've discovered in your data on the dashboard, and then share the dashboard with others. See Analytics Dashboard.
Customization support for Python environments
You can customize the software configuration of the Python environments that you create. See Environments.
Watson Knowledge Catalog
Refine catalog data assets
You can now refine data assets that contain relational data after you add them to a project. Projects that you create with Watson Knowledge Catalog include the Data Refinery tool so that you can cleanse and shape data.
Profile documents with unstructured data
Data assets that contain unstructured data, such as Microsoft Word, PDF, HTML, and plain text documents, are automatically profiled by IBM Watson Natural Language Understanding to show the distribution of inferred subject categories, concepts, sentiment, and emotions for the document on the asset’s Profile page. You can also see the profile when you add the asset to a project. See Profile data assets.
Preview PDF documents
You can now see the contents of PDF documents that you add to a catalog on the asset’s Overview page.
Review and rate assets
You can now review and rate an asset, or read reviews by other users in a catalog. View the asset and go to its Reviews page to read reviews or to add a review and a rating.
See recommended and highly rated assets
You can now see recommended and highly rated assets on the Browse page of a catalog:
- Click Watson Recommends to see the top 20 assets that are recommended for you based on attributes common to the assets that you've accessed.
- Click Highly Rated to see the assets that have the highest ratings.
Target file format
If you select a file in a connection as the target for your data flow output, you can now select one of the following formats for that file:
- AVRO - Apache Avro
- CSV - Comma-separated values
- PARQ - Apache Parquet
Week ending 9 March 2018
You'll notice color and font style changes across the IBM Watson apps and tools. These changes align with the style of IBM Cloud to provide a more consistent user experience.
New Watson Analytics connector
Projects and catalogs now support connections to IBM Watson Analytics, enabling you to store data there. (The Watson Analytics connector supports target connections only.)
New documentation locations for DSX Local and DSX Desktop
DSX Local and DSX Desktop documentation is no longer included with the IBM Watson Studio documentation.
The new documentation locations are:
- Data Science Experience Local documentation: https://content-dsxlocal.mybluemix.net/
- Data Science Experience Desktop documentation: https://content-dsxdesktop.mybluemix.net/
PixieDust 1.1.8 adds PixieDebugger
PixieDust release 1.1.8 introduces a visual Python debugger for Jupyter Notebooks: PixieDebugger. It is built as a PixieApp, and includes a source editor, local variable inspector, console output, the ability to evaluate Python expressions in the current context, breakpoints management, and a toolbar for controlling code execution. In addition to debugging traditional notebook cells, PixieDebugger also works to debug PixieApps, which is especially useful when troubleshooting issues with routes. See Release notes for 1.1.8.
Basic date and time support
Some Data Refinery operations now support datetime values.
- Convert column type (support for converting from datetime values only)
- Sort ascending
- Sort descending
Watch for more operations to provide this support in the future!
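Data Refinery generates these operations for you; purely as an illustration of what "convert column type" and "sort ascending" do with datetime values, here is a rough pandas equivalent (the column names are made up):

```python
import pandas as pd

df = pd.DataFrame({
    "event": ["signup", "login", "purchase"],
    "timestamp": ["2018-03-02 09:15", "2018-01-15 08:00", "2018-02-20 17:45"],
})

# Convert column type: the strings become datetime values.
df["timestamp"] = pd.to_datetime(df["timestamp"])

# Sort ascending on the datetime column.
df = df.sort_values("timestamp").reset_index(drop=True)
print(df["event"].tolist())  # ['login', 'purchase', 'signup']
```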
New Substitute operation
The Substitute operation in the Frequently used category can obscure sensitive information from view by substituting a random string of characters for the actual data in the column.
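The effect is like the following sketch, which replaces each value in a column with a random string. This is a simplified illustration only, not the operation's actual algorithm:

```python
import random
import string

def substitute(values, length=10, seed=42):
    # Replace each value with a random string of letters so the real data
    # is obscured. Simplified sketch; not Data Refinery's actual algorithm.
    rng = random.Random(seed)
    return ["".join(rng.choice(string.ascii_letters) for _ in range(length))
            for _ in values]

ssns = ["123-45-6789", "987-65-4321"]
masked = substitute(ssns)
```

After substitution, none of the original values remain visible in the column.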
Import assets from IBM InfoSphere Information Governance Catalog
You can import assets into a catalog from an Information Governance Catalog (IGC) archive file. You must have the Watson Knowledge Catalog Professional plan and have the Admin role in the catalog to import IGC assets. See Import assets from Information Governance Catalog into a catalog.
Week ending 2 March 2018
Apache Spark Service Python 3.5 notebooks now on Anaconda 5.0
The Apache Spark service upgraded the Anaconda distribution used for Watson Studio notebook environments to Anaconda 5.0. This upgrade changes the versions of many libraries that were previously installed in the Watson Studio notebook environment. Some of the updated libraries have changed their APIs, which might cause your existing code to throw warnings or errors.
Note: This update affects only Python 3.5 notebooks. Python 2.7 notebooks do not use the Anaconda distribution.
RStudio and R version upgraded
RStudio in Watson Studio is upgraded to version 1.1.419 and R in RStudio is now version 3.4.3. See the list of many new features that you'll be able to use with RStudio in Watson Studio: RStudio release history. You might have to update some packages to work with the new R version.
Message Hub (Source) operator configuration
Until now, if a streams flow stopped while Message Hub producers continued to send messages to the topic, those messages were retained in the Message Hub queue, but the restarted streams flow could not go back and consume them.
Now, in the Properties pane of Message Hub (Source operator), you can select the Resume reading check box to start reading in the Message Hub queue from where the streams flow left off.
You can also configure the Default Offset, which determines where to begin reading in the Message Hub queue when the streams flow runs for the first time, when Resume reading is not selected, or when the resumption offset is lost. You can choose to start reading from the latest message or from the earliest message.
User-installed Python libraries
In addition to the supported and pre-installed packages, your streams flow might need other packages for specific work. For these cases, you can install Python packages that are managed by the pip package management system. The packages are found on the Python Package Index (PyPI). By default, pip installs the latest version of a package, but you can install other versions.
In Streams Designer, edit the streams flow that will use the package, and then open the Environment settings.
For details, see Installing other Python libraries.
Week ending 23 February 2018
Snapshot view in Data Refinery
You can see what your data looked like at any point in time by clicking a step in the data flow. This puts Data Refinery into snapshot view. For example, if you click the data source step, you'll see what your data looked like before you started refining it. You can also click any operation step to see what your data looked like after that operation was applied.
New operation descriptions
Data Refinery provides a description for each operation in the Steps tab. (This replaces the R code that was previously displayed.)
Insert, edit, and delete operations in a data flow
Previously, you could delete the last operation step in a data flow. Beginning this week, you can also insert, edit, and delete any operation step in a data flow. See Data flows and steps for more information.
Cancel a data flow run
You can cancel a data flow run when it's in progress, that is, when its status is Running. To cancel a run, select Cancel from the run's menu on the History tab of the Summary and Runs page.
Insert and update rows in relational database targets
If you select an existing relational database table or view as the target for your data flow output, you have several options that affect the existing data set.
- Overwrite - Drop the existing data set and re-create it with the rows in the data flow output
- Truncate - Delete the rows in the existing data set and replace them with the rows in the data flow output
- Insert Only (Append) - Append all rows of the data flow output to the existing data set
- Update Only - Update rows in the existing data set with the data flow output; don’t insert any new rows
- Upsert (Merge) - Update rows in the existing data set and append the rest of the data flow output to it
For the Update Only and Upsert (Merge) options, you'll need to select the columns in the output data set to compare to columns in the existing data set. The output and target data sets must have the same number of columns, and the columns must have the same names and data types in both data sets.
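These target options map to familiar merge semantics. As a rough sketch only (the product generates database operations, not pandas), here is what Upsert (Merge) and Update Only do when a hypothetical "id" column is the key to compare:

```python
import pandas as pd

existing = pd.DataFrame({"id": [1, 2], "city": ["Austin", "Boston"]})
output   = pd.DataFrame({"id": [2, 3], "city": ["Berlin", "Chicago"]})

# Upsert (Merge): rows whose key matches are updated from the output,
# and the remaining output rows are appended.
upserted = (pd.concat([existing, output])
              .drop_duplicates(subset="id", keep="last")
              .sort_values("id")
              .reset_index(drop=True))

# Update Only: like Upsert, but output rows whose keys are not already
# in the existing data set are dropped instead of inserted.
updated = upserted[upserted["id"].isin(existing["id"])].reset_index(drop=True)
```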
Week ending 16 February 2018
Environments for notebooks (Beta)
In this beta release of environments, you can select default Anaconda environments with different hardware and software configurations for running Jupyter notebooks. You can have more than one environment in a project and then associate these environments with your notebooks depending on the hardware and software requirements of each notebook. See Environments.
View statistics about data assets with personal or restricted information
The Data Dashboard has been extended. You can now check how many data assets contain personal or restricted data. By default, the following classifications are identified: sensitive personal information (SPI), personally identifiable information (PII), or confidential. You can also use your own business terms instead of these classifications. See View the dashboard.
Choose email addresses from a list in the Rule Builder
When you create a rule in the Rule Builder and need to specify email addresses, start typing and then choose from a list of matching email addresses.
Week ending 9 February 2018
Create, edit, and delete data flow schedules
When you save or run a new data flow, you can add a one-time or repeating schedule for that data flow. You can subsequently edit or delete the schedule from Data Refinery as well.
Scheduled data flow runs are displayed on the Schedule tab of the Summary and Runs page. Past data flow runs are displayed on the History tab of the same page.
Preview source and target data sets from the Summary and Runs page
You view summary information for a data flow by going to the project > Assets tab > Data flows section and clicking the data flow you're interested in. In the Summary section, you can now preview both the source and target data sets.
Object Storage OpenStack Swift deprecation
When you create a project, use IBM Cloud Object Storage instead of Object Storage OpenStack Swift.
Object Storage OpenStack Swift is no longer available when you create a project if you access Watson Studio from the US-South Dallas region with the dataplatform.ibm.com URLs. The Object Storage OpenStack Swift service is available until the end of March 2018 in the United Kingdom region with the eu-gb.dataplatform.ibm.com URLs. Projects with Object Storage OpenStack Swift continue to work.
Easily add Community data sets to a project and notebook
You can add a Community data set to a project by clicking the Add to project button on the data set and selecting a project. Then you can use the Insert to code function for the data set within a notebook. See Load and access data in a notebook.
New Python connector for IBM Cloud Object Storage
You can now use Python connector code in a notebook to load data from and save data to an IBM Cloud Object Storage instance. See Python connectors.
PixieDust 1.1.7 is available
PixieDust release 1.1.7 adds support for aggregate value filtering, updates table visualization, improves Brunel rendering, and has some updated icons. See Release notes for 1.1.7.
Week ending 2 February 2018
New Services menu
The Data Services menu is now the Services menu, with new options to add and manage IBM Cloud AI and compute services, as well as data services. See Add and manage IBM Cloud services.
New canvas design
Check out the new appearance of the Streams Designer canvas! It now has the same look and feel as the Watson Studio common canvas.
Take note of these changes:
- The bottom toolbar actions (Settings, Run, Save, Metrics) were moved to the top toolbar.
- The Close button is gone. Instead, go to the Metrics page from the top toolbar, or click the breadcrumbs to return to the Project page.
- Autosave is coming! In the meantime, save your work from the top toolbar.
- Code (in Sources list of operators)
Previously, the Code operator was only a Processing and Analytics type of operator. Now, the Code operator is also available as a Source operator. This operator gives you a convenient way to generate your own sample data or to consume data from an external source.
For details, see Code operator.
- Python Machine Learning (in Processing and Analytics list of operators)
This operator provides a simple way to run Python models of popular frameworks for real time prediction and scoring.
The Python ML operator is based on the Code operator. In addition, it can upload the model file objects from Cloud Object Storage and generate the necessary callbacks in the code.
For details, see Python Machine Learning operator.
Save data flow output as a data asset
You can save data flow output as a new data asset or you can replace an existing data asset. By default, data flow output is saved as a new data asset in the project.
To specify that your data flow output be saved as an existing data asset:
- From the Data flow output pane, click Change Location.
- Select the data asset you want to replace. Note that the target name changes to the name of the existing asset.
- Click Save Location.
Change your column selection in the Operation pane
After you choose an operation, you can change the column that you want to apply the operation to. Just click Change Column Selection at the top of the Operation pane, select a new column, and click Save.
New progress indicator
A progress indicator is now displayed when you choose to refine a data set. The indicator provides useful information about what's going on behind the scenes of Data Refinery.
Week ending 26 January 2018
New Teradata connector
Projects and catalogs now support connections to Teradata, enabling you to access data stored there.
Any collaborator can leave a project
You can leave a project, regardless of your role in it. Previously, only collaborators with the Admin role could leave a project. See Leave a project.
Data sample size
The name of the source file and the number of rows in the data sample are now displayed at the bottom of Data Refinery. (A data sample is the subset of data that's read from the data source and visible in Data Refinery. It enables you to work quickly and efficiently while building your data flow.)
Preview data sources
When you're selecting data to add to Data Refinery, you can now preview a data source before selecting it. Simply click the eye icon next to the file, table, or view that you want to preview.
Week ending 19 January 2018
See your current account
If you've been added as a user in other IBM Cloud accounts, you can now quickly see which account you're logged in to by clicking your profile avatar. The account shows under your user name. You can switch accounts by clicking your avatar and then Settings.
New Dropbox connector
Projects and catalogs now support connections to Dropbox, enabling you to access files stored there. To obtain the application token that's needed to configure a Dropbox connection, follow the instructions in the Dropbox OAuth guide.
Edit and delete capabilities
You can now edit and delete more data policy items:
- Delete business terms: In Business Glossary, you can now delete business terms that are in draft or archived state.
- Delete policies: In Policy Manager, you can delete draft or archived policies.
- Edit rules: You can update rules in published policies if you have the Admin role for the Watson Knowledge Catalog app in the Admin Console. The updated rule applies to all other published policies that contain that rule.
Governance Dashboard is renamed to Data Dashboard
The Governance Dashboard is now called Data Dashboard. If you have the necessary permissions, you can see the Data Dashboard by choosing Catalog > Data Dashboard.
Watson Knowledge Catalog
Discover assets from PostgreSQL
You can now discover assets from connections to PostgreSQL data sources.
Connections to IBM Cloud Object Storage are on the Settings page
You can now see the connections to your IBM Cloud Object Storage instance on the Settings page of the catalog. The connections no longer appear in the list of catalog assets.
Mark connection assets as private
You can now mark a connection asset as private so that only the connection asset members can see and use the connection.
See policy information when assets are blocked
When an asset is blocked by policies, you now see a message that identifies the policy.
New Streams Designer tutorials
Use the new Streams Designer tutorials to gain hands-on experience in designing, running, and troubleshooting your streams flows. You can watch videos or follow along with the tutorials to see how easy it is to design and deploy a streams flow. See Tutorials of streams flows.
Larger data samples in Data Refinery
To enable you to work quickly and efficiently when creating a data flow, Data Refinery operates on a subset of rows in each data set. Beginning this week, the size of that subset is larger (750 KB). This enables you to see more of your data and use more data for interactive cleansing and shaping operations.
PixieDust 1.1.6 is available
PixieDust release 1.1.6 updates the Bokeh version and fixes a Bokeh display problem. PixieApps now automatically collapse dropdowns. See Release notes for 1.1.6.
Week ending 12 January 2018
Library to interact with project assets within notebooks
You can use the pre-installed project-lib library in Python notebooks to interact with projects and project assets. Using the project-lib library, you can access project metadata and assets, including files and connections. The library also contains functions that simplify fetching files from the object storage associated with the project. See Use project-lib to interact with projects and project assets.
Dive deeper with enhanced flow editor topics
Lists of the Modeler nodes and SparkML nodes now provide you with more detail about each of the node controls and functions. See Creating machine learning flows with SparkML nodes and Creating machine learning flows with SPSS nodes.
Watson Machine Learning topics re-engineered for the way you work
It's a flow thing. We've reworked the order of topics to reflect the way data scientists use our product. By making use of extensive feedback and leveraging content on IBM Cloud, we hope to make it easier for you to learn as you go. See Watson Machine Learning.
Watson Knowledge Catalog
Recent catalogs in the Catalog menu
You can quickly open catalogs that you've accessed recently from the Catalog menu. Previously, you chose View All Catalogs to go to the Your Catalogs page and then opened the catalog you wanted.
Week ending 5 January 2018
Important information appears in an announcement bar
If there’s an important product update or great new feature that we think you need to know about, it appears in an announcement bar at the top of the screen. You can easily dismiss announcements, and if you want to read them again, click the notification bell to see the notifications log.
Invite project collaborators with an email list
It's easier to invite multiple collaborators to a project. You can paste a list of email addresses that are separated by commas into the Invite field, instead of pressing Enter between each email address.
PixieDust 1.1.5 is available
PixieDust now properly supports Python’s string format operator: %. When you define PixieApp views, you can choose to use Markdown syntax instead of HTML. See Release notes for 1.1.5.
Watson Knowledge Catalog
Improved navigation for data policies
In the Business Glossary, you can use breadcrumbs to quickly jump to previous screens.
In the Policy Manager, you can sort by policy status to find your policies within a category. By default, all published policies are now displayed first.
Week ending 22 December 2017
Pixiedust 1.1.4 is available
PixieApps can now use any third-party plotting library (like Matplotlib) with a route method, and developers can now more easily create PixieApp HTML fragments with Jinja. See Release notes for 1.1.4.
Week ending 15 December 2017
Assign project and catalog creation rights in the Admin Console
Your account members no longer need to be administrators of a Cloud Object Storage instance to create projects or catalogs. You can now specify which instances of Cloud Object Storage can be used by non-administrators on the Project & Catalog Creation page of the Admin Console.
Watson Knowledge Catalog and data policies
Include more information when importing business terms
When importing business terms to the Business Glossary, you can now add the business definition and the state of the business term to the CSV file you want to import.
Sort policies by columns
You can now sort policies contained in a category by columns, such as name, status, and date. By default, policies are sorted by status.
GUI operation enhancements
- The Search bar in the Operation list helps you quickly find the operation you're looking for.
- The Join operation in the Organize category can join two data sets in a variety of ways. You can perform a full join, inner join, left join, right join, semi join, or anti join. You can also select the columns you want to see in the result set, and if there are same-named columns between the two data sets, you can specify unique suffixes to differentiate them.
- You no longer need to select a column before clicking the Operation menu. You'll be prompted to make a selection after you choose an operation and only those columns that are appropriate for that operation are selectable.
- A snapshot of the selected columns is shown to the right of the Operation pane so that you can see the data while you fill in operation details.
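Data Refinery builds these joins for you in the GUI; purely as an illustration of the join semantics, here is each join family expressed in pandas (in a semi or anti join, only left-side columns are kept):

```python
import pandas as pd

left  = pd.DataFrame({"id": [1, 2, 3], "name": ["Ana", "Ben", "Cai"]})
right = pd.DataFrame({"id": [2, 3, 4], "score": [88, 92, 75]})

inner  = left.merge(right, on="id", how="inner")  # rows that match on both sides
full   = left.merge(right, on="id", how="outer")  # all rows from both sides
left_j = left.merge(right, on="id", how="left")   # all left rows, matched where possible
semi   = left[left["id"].isin(right["id"])]       # left rows that have a match
anti   = left[~left["id"].isin(right["id"])]      # left rows without a match
```

When the two data sets share column names, pandas disambiguates them with suffixes (the `suffixes` parameter of `merge`), which corresponds to the unique suffixes you can specify in the Join operation.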
Code operation enhancements
- The command line has new operation- and function-level help to assist you in quickly and easily creating customized operations that you can apply to your data.
- Background highlighting of command line elements provides a visual indicator that syntax, column, and function suggestions are available. Just click the elements to invoke the suggestions.
- Coming soon... template-level help!
Data format specification
If file-based data doesn't look like it should when it's read into Data Refinery, click the Specify data format icon. To ensure that Data Refinery can correctly read your data, modify the data format assumptions, such as whether the first line contains column headers, what the field delimiter is, and what the quote and escape characters are.
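These assumptions correspond to standard parser options. As a rough analogy only (not what Data Refinery runs internally), the same settings look like this in pandas, using a made-up pipe-delimited file:

```python
import io
import pandas as pd

raw = "name|age\nAna|34\nBen|29\n"

df = pd.read_csv(
    io.StringIO(raw),
    sep="|",          # the field delimiter
    header=0,         # the first line contains column headers
    quotechar='"',    # the quote character
    escapechar="\\",  # the escape character
)
```

Adjusting any one of these assumptions changes how the raw bytes are split into columns and rows.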
Week ending 8 December 2017
When you sign up for an IBM Watson account, you can now sign up for all the IBM Watson apps that you want in a single screen. See Sign up.
Improved getting started experience
When you sign in to IBM Watson apps, the Get Started information on the landing page shows more key tasks so you can be productive faster.
More control over your services
You no longer provision a Spark service and an object storage service when you sign up for Watson Studio. Instead, when you create a project you provision the object storage type that you want and you can choose whether to include a Spark service in your project. See Set up a project.
View scheduled job details
You can now view details about the scheduled jobs for running notebooks without editing the schedule. While editing the notebook, click the Schedule icon and then choose View job details. See Schedule a notebook.
Watson Knowledge Catalog
General availability for Watson Knowledge Catalog
Watson Knowledge Catalog service is now generally available (GA). Read this blog to learn how to switch your beta catalogs to a GA plan. Read this FAQ to understand what happens to your beta functionality when you switch to a GA plan.
Discover data assets from connections
You can discover assets from a connection, so that all user tables and views accessible from the connection are added as data assets to the project that you select. From the project, you can evaluate each data asset and publish the ones you want to the catalog.
You can discover assets from connections to the following data sources:
- IBM Cloud Object Storage (IaaS)
- IBM Cloud Object Storage
- Db2 on Cloud
- Db2 Warehouse on Cloud
- Microsoft SQL Server
- MySQL on Compose
- Postgres on Compose
Attribute classifier groups for rules
When a data asset is added to a catalog with data policies enforced, it is automatically profiled and classified as part of the data policy framework. The profiling process samples the data asset and leverages different algorithms to determine the type of content in the data asset.
Automatic profiling is based on 163 attribute classifiers provided by IBM. These attribute classifiers are categorized into 12 classifier groups provided by IBM. You can now select one of these classifier groups when defining rules instead of having to select individual attribute classifiers from a long list. For example, if you want to restrict access to a data set that contains personal information, you can select the classifier group Personal Information, which comprises basic attributes of an individual, such as person name, date of birth, and gender. See Classifier groups.
Edit policies and rules
You can now edit published policies to refresh policy details and to add or delete the rules they contain. If you just want to change the name or description of a policy, hover over the information you want to update. To add or delete rules contained in a published policy, click Edit and select which rules you want to delete, add, or create. To edit category details, hover over the name or description of the category. See View or edit policies.
Week ending 1 December 2017
Improved security: restrict project membership
When you create a project, you can now choose to restrict who can be a collaborator. If you select the Restrict who can be a collaborator checkbox, you can add only members of your IBM Cloud account to the project, or, if your company has SAML federation set up in IBM Cloud, only employees of your company.
The project must be restricted to add catalog assets. See Set up a project.
Existing projects can no longer get assets from catalogs
You can no longer add assets from a catalog to existing projects that are not restricted. However, any catalog assets that you previously added to an unrestricted project remain in the project.
View attribute classifiers for data assets
If you have the Watson Knowledge Catalog app, you can create a profile of a data asset to view the attribute classifiers that are inferred for each column in any data asset in a project. Click the data asset name to see the preview and then click the Profile tab. Click Create Profile to start the profiling process.
PixieDust support for Brunel
PixieDust 1.1.3 supports Brunel as an additional chart-rendering option for feature-rich interactive data visualizations. The Brunel renderer for PixieDust supports all chart types: bar, line, scatter, pie, histogram, and maps. Maps also support extra visualization options: heatmap, treemap, and chords. See Release notes for 1.1.3.
Watson Knowledge Catalog
Improved security: restricted membership
Catalog membership is now restricted to members of your IBM Cloud account, or, if your company has SAML federation set up in IBM Cloud, employees of your company. To add catalog assets to a project, the project must be similarly restricted. However, members of unrestricted projects can publish assets to a catalog if they are members of both and have sufficient permissions. See Manage access to a catalog.
View attribute classifiers for data assets
The profile of a data asset shows the attribute classifiers that are inferred for each column in the data set. You can see the profile when you view the asset and click the Profile tab. In catalogs with data policies enforced, data asset profiles are created automatically, based on the first 5000 rows of the data set. In catalogs that do not have data policies enforced, data assets are not profiled automatically. You must create a profile. See Profile data assets.
Delete connection assets
You can now delete connection assets from a catalog.
Disqualified rows are shown in preview – In the Edit Schema window, you can preview the incoming events based on the defined schema of the source operator. Disqualified values are highlighted in red. The preview helps you in two ways:
- You’ll get an indication of which events will be discarded when they don’t comply with the defined schema.
- You’ll save time because you don’t need to run the streams flow to discover a mismatch with the schema.
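The check described above behaves like per-field type validation over each incoming event. A rough sketch, assuming events arrive as dictionaries and the schema maps field names to expected Python types (the function and field names are illustrative, not the Streams Designer API):

```python
def check_event(event, schema):
    """Return the fields whose values don't match the schema (the disqualified values)."""
    bad = []
    for field, expected_type in schema.items():
        if not isinstance(event.get(field), expected_type):
            bad.append(field)
    return bad

schema = {"id": int, "temperature": float}
events = [{"id": 1, "temperature": 21.5}, {"id": "x", "temperature": 19.0}]

for event in events:
    bad_fields = check_event(event, schema)
    if bad_fields:
        # In Streams Designer, these values would be highlighted in red in the
        # preview, and the event would be discarded at run time.
        print("disqualified:", bad_fields)
```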
Download logs and link to Streaming Analytics instance - Streams Designer provides a notification panel on the Metrics page that shows any compilation or runtime errors. To help you debug an error, you can now download the logs from the Streaming Analytics instance. In addition, a link is provided to the Streaming Analytics instance that is used to run the streams flow.
Automatic restart of Streaming Analytics instance - If the Streaming Analytics instance is stopped while you’re running a streams flow, you can now automatically restart the instance in Streams Designer without having to go to IBM Cloud to do so.
If the instance cannot be started (for example, the Lite plan expired and the instance is disabled), you will receive a message with a link to the instance on IBM Cloud.
Indication of an 'unhealthy' running streams flow - When a streams flow is running, Streams Designer now indicates whether the flow is 'unhealthy'. This tells you that there are issues with the running flow that should be investigated. Look for errors in the Notification panel or download the logs.
Week ending November 24, 2017
Mention users in comments in notebooks
While editing a notebook, you can mention another user, who is a project collaborator, in a comment. Only that user is notified of the comment. To mention a user in a comment, enter the @ symbol and start entering the user's name until you can choose it from the search results: for example, @joe_blue. Then Joe Blue receives a notification that you mentioned him in a comment in a notebook.
Week ending November 17, 2017
Streams Designer in open beta
Use Streams Designer to collect, curate, analyze, and act on massive amounts of changing data in real time. Regardless of whether the data is structured or unstructured, you can leverage data at scale to drive real-time analytics for up-to-the-minute business decisions. See Get started with Streams Designer.
If you participated in the streaming pipelines closed beta, here are the new features for open beta.
Streaming pipelines now has a new name – Streams Designer! The streaming pipeline capability is now called Streams Designer, and the streaming pipeline asset is called streams flow.
Streams Designer in the Tools menu
You can now create a new streams flow directly from the Tools menu.
You associate the new streams flow with a project in the New Streams Flow window.
Support for new IBM Cloud Object Storage (IAM support)
You can now connect and stream data to your IBM Cloud Object Storage instances by using the IBM Cloud Object Storage operator.
Full integration with connections
Streams Designer is now fully integrated with connections. You select your data source by using a connection. Within your streams flow, you can reuse the connections that you defined in the project. You can drag existing connections onto the canvas to create operators that are pre-connected to service instances. You can also create a new connection that can be used in other streams flows in the project.
New Db2 Warehouse operator
Db2 Warehouse on Cloud is a fully managed, enterprise-class cloud data warehouse service. Use the new Db2 Warehouse operator to connect to your Db2 Warehouse on Cloud instances.
Streams Flows 'View All'
In the Assets tab of a project’s Project page, the 'View all' link is shown when there are more than 10 streams flows.
Click the 'View all' link to see all streams flows in table or tile view. You also get useful information about your streams flows, such as status, running time, and more.
A quick guided tour now introduces first-time users to concepts and features in Data Refinery.
- The Trim quotes operation in the Text category can remove single or double quotation marks that enclose text
- The Convert column value to missing operation in the Cleanse category can convert values in the selected column to missing values in one of two ways:
- Column values in the selected column match values in a second, specified column
- Column values in the selected column match a specified value
Data flow enhancements
- You can now save data flow output to a new connection asset by selecting a folder or schema, saving the location, and then providing a new, unique name for the target data set
- The data flow run information now includes new status icons, the number of rows read from the data source and written to the target, and the name of the user who initiated each run
Week ending November 10, 2017
New and enhanced operations
- A new Concatenate string operation in the Text category can link any string with a column value. You can add the string to the left, right, or both sides of the value.
- The Split column operation can now split a column by using a regular expression pattern. This new method joins the text, position, and default methods already in place.
- The Replace substring operation can now perform replacement based on a regular expression pattern, in addition to the already-supported text method.
- The Filter operation now supports multiple conditions on a single filter. You can combine the different conditions with AND or OR operators.
- The Replace missing values operation is now supported for string columns as well as numeric columns.
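The regular-expression variants of these operations behave like standard regex splitting and substitution, and the multi-condition filter works like combining predicates with AND/OR. A Python sketch of the equivalent transformations (illustrative only, not Data Refinery's implementation):

```python
import re

# Split column: break a value into parts by a regular expression pattern
parts = re.split(r"[;,]\s*", "red, green; blue")
print(parts)  # → ['red', 'green', 'blue']

# Replace substring: rewrite every match of a pattern
masked = re.sub(r"\d{4}", "####", "card 1234 5678")
print(masked)  # → card #### ####

# Filter with multiple conditions combined with AND / OR
rows = [{"age": 34, "country": "DE"}, {"age": 17, "country": "US"}]
kept = [r for r in rows if r["age"] >= 18 and r["country"] == "DE"]
print(kept)  # → [{'age': 34, 'country': 'DE'}]
```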
Watson Machine Learning
Flow Editor and notebook enhancements
- The Flow Editor has new navigation features. For an overview of the changes, you can take the tour, which is available from within the Flow Editor work area when you create a new flow.
- Machine Learning notebooks have been updated to include the most recent API changes, such as the requirement to use the Bearer token for authorization calls.
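Authorization calls now carry the token in an `Authorization: Bearer` header. A minimal sketch of building such a request with the Python standard library; the URL and token value are placeholders for illustration, not real endpoints or credentials:

```python
import urllib.request

token = "PASTE_YOUR_ML_TOKEN_HERE"  # placeholder; obtain a real token from the service
url = "https://example.com/v3/models"  # hypothetical endpoint for illustration

request = urllib.request.Request(url)
# The API expects the literal prefix "Bearer " before the token,
# not the bare token on its own.
request.add_header("Authorization", "Bearer " + token)

print(request.get_header("Authorization"))  # → Bearer PASTE_YOUR_ML_TOKEN_HERE
```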
Week ending November 3, 2017
New IBM Watson apps in open beta
The new IBM Watson apps Watson Knowledge Catalog and Data Refinery are now in open beta. If you have a Watson Studio account, you can try them out for free by clicking your avatar and then Add other apps. Or you can sign up for any of the apps.
New type of project
The new type of project, called an IBM Watson project, has these new features:
- The project uses IBM Cloud Object Storage.
- You can use the new IBM Watson apps with the project.
- You create and edit connections within the project.
To create an IBM Watson project, choose IBM Cloud Object Storage when you create the project. You can continue to create legacy-style projects that use Object Storage OpenStack Swift.
See IBM Watson projects.